

Pendra Python SDK

OpenAI-compatible Python client for sovereign UK inference. Sync and async, with streaming support. Python 3.10+.

Installation

Terminal
$ pip install pendra

Requires Python 3.10 or later. View on PyPI

Quick Start

python
from pendra import Pendra

client = Pendra(
    api_key="pdr_sk_...",   # or set PENDRA_API_KEY env var
)

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{
        "role": "user",
        "content": "What is the capital of the UK?"
    }]
)

print(response.choices[0].message.content)

Streaming

Stream responses token by token using a context manager.

python
with client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Write a poem about London."}],
    stream=True,
) as stream:
    for chunk in stream:
        print(chunk.choices[0].delta.content or "", end="", flush=True)

Async Client

Use AsyncPendra for asyncio applications. Supports both streaming and non-streaming.

python
import asyncio
from pendra import AsyncPendra

async def main():
    async with AsyncPendra(api_key="pdr_sk_...") as client:
        response = await client.chat.completions.create(
            model="llama3.2",
            messages=[{"role": "user", "content": "Hello!"}]
        )
        print(response.choices[0].message.content)

asyncio.run(main())
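Streaming works on the async client as well. The sketch below assumes the async stream mirrors the sync context-manager pattern shown above, with `async with` and `async for` in place of their sync counterparts:

```python
import asyncio
from pendra import AsyncPendra

async def main():
    async with AsyncPendra(api_key="pdr_sk_...") as client:
        # Assumed to mirror the sync API: the call returns an async stream
        # usable as an async context manager when stream=True.
        async with client.chat.completions.create(
            model="llama3.2",
            messages=[{"role": "user", "content": "Write a poem about London."}],
            stream=True,
        ) as stream:
            async for chunk in stream:
                print(chunk.choices[0].delta.content or "", end="", flush=True)

asyncio.run(main())
```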

Image Generation

Generate images from a text prompt. Returns base64-encoded PNGs by default — decode and save to disk, or set response_format="url" when supported by the model.

python
import base64

response = client.images.generations.create(
    model="x/z-image-turbo",
    prompt="A red London double-decker bus at sunset",
    size="1024x1024",
)

with open("bus.png", "wb") as f:
    f.write(base64.b64decode(response.data[0].b64_json))

Use AsyncPendra for async applications:

python
async with AsyncPendra(api_key="pdr_sk_...") as client:
    response = await client.images.generations.create(
        model="x/z-image-turbo",
        prompt="A red London double-decker bus at sunset",
    )

Image generation is non-streaming — the endpoint returns a single JSON response once the worker finishes.

Embeddings

Generate vector embeddings for retrieval, search, and RAG pipelines. OpenAI-compatible — pass a string or a list of strings and get back a CreateEmbeddingResponse with one embedding per input.

python
response = client.embeddings.create(
    model="nomic-embed-text:latest",
    input=["The quick brown fox", "jumps over the lazy dog"],
)

for item in response.data:
    print(item.index, len(item.embedding), "dims")

print(response.usage.prompt_tokens)

Any embedding model in the Pendra model catalogue works — nomic-embed-text, mxbai-embed-large, bge-m3, qwen3-embedding, all-minilm. Also available on AsyncPendra via await client.embeddings.create(...).
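As a quick sanity check on the vectors you get back, you can compare two embeddings with cosine similarity using only the standard library. The helper below is illustrative, not part of the SDK; with real responses you would pass `item.embedding` vectors in place of the toy values:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# With real embeddings: vectors = [item.embedding for item in response.data]
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 1.0, 0.0]
print(round(cosine_similarity(v1, v2), 3))  # 0.5
```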

List Models

Query available models from your Pendra instance.

python
models = client.models.list()

for model in models:
    print(model.id)

Migrating from OpenAI

The Pendra SDK mirrors the OpenAI interface. Two lines to switch — your existing code just works.

python
# Before
from openai import OpenAI
client = OpenAI(api_key="sk-...")

# After
from pendra import Pendra
client = Pendra(api_key="pdr_sk_...")

API Reference

client.chat.completions.create()

Create a chat completion. Returns ChatCompletion or Stream.

| Parameter | Type | Description |
| --- | --- | --- |
| model | str | Model ID (e.g. "llama3.2") |
| messages | list[dict] | Chat messages with role and content |
| stream | bool | Enable streaming (default False) |
| temperature | float? | Sampling temperature (0–2) |
| max_tokens | int? | Maximum tokens to generate |
| top_p | float? | Top-p sampling value |
| stop | str \| list? | Stop sequence(s) |
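Putting the optional parameters together (a sketch; the prompt and values are illustrative):

```python
response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "List three UK cities."}],
    temperature=0.2,   # lower temperature for more deterministic output
    max_tokens=100,    # cap the response length
    stop=["\n\n"],     # stop at the first blank line
)
print(response.choices[0].message.content)
```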

client.images.generations.create()

Generate images from a text prompt. Returns ImageResponse. Also available as await client.images.generations.create(...) on AsyncPendra.

| Parameter | Type | Description |
| --- | --- | --- |
| model | str | Image model ID (e.g. "x/z-image-turbo") |
| prompt | str | Text description of the image to generate |
| n | int? | Number of images, 1–4 (default 1) |
| size | str? | Dimensions as WIDTHxHEIGHT (default "1024x1024") |
| response_format | str? | "b64_json" (default) or "url" |
| num_inference_steps | int? | Diffusion steps (model-dependent) |
| seed | int? | Random seed for reproducibility |
| negative_prompt | str? | Text to avoid in the generated image |

client.embeddings.create()

Create embeddings. Returns CreateEmbeddingResponse. Also await client.embeddings.create(...) on AsyncPendra.

| Parameter | Type | Description |
| --- | --- | --- |
| model | str | Embedding model ID (e.g. "nomic-embed-text:latest") |
| input | str \| list[str] | Text to embed. Accepts a single string or a batch. |
| encoding_format | str? | "float" (default) or "base64" |
| dimensions | int? | Output dimensionality (Matryoshka models like nomic-embed-text) |
| user | str? | Optional end-user identifier for abuse monitoring |

client.models.list()

Returns a list of Model objects available on the instance. Each model has id, object, created, and owned_by fields.

Environment Variables

| Variable | Description |
| --- | --- |
| PENDRA_API_KEY | Your API key (pdr_sk_...). Used when no api_key is passed to the constructor. |

Error Handling

All exceptions inherit from pendra.APIError.

| Exception | Status | When |
| --- | --- | --- |
| AuthenticationError | 401 | Invalid or missing API key |
| RateLimitError | 429 | Too many requests |
| APIStatusError | 4xx/5xx | Any other non-2xx response |
| APIConnectionError | (none) | Network or connection failure |
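A common pattern is to back off and retry on RateLimitError. The helper below is a generic sketch, demonstrated with a stand-in exception class so it runs anywhere; in real code you would catch pendra.RateLimitError instead:

```python
import time

class RateLimitError(Exception):
    """Stand-in for pendra.RateLimitError, for illustration only."""

def with_retries(fn, retries: int = 3, base_delay: float = 0.01):
    """Call fn(), retrying with exponential backoff on RateLimitError."""
    for attempt in range(retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == retries - 1:
                raise  # out of retries, propagate the error
            time.sleep(base_delay * (2 ** attempt))

# Simulate a call that is rate-limited twice, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError("429")
    return "ok"

print(with_retries(flaky))  # ok
```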