Node.js SDK
The Pendra Node SDK is a TypeScript-first, zero-dependency client for sovereign UK inference. It supports streaming, ships as dual ESM/CJS, and requires Node.js 18 or later.
Installation
$ npm install pendra
Zero runtime dependencies.
Quick start
import Pendra from 'pendra';
const client = new Pendra({
  apiKey: 'pdr_sk_...', // or set PENDRA_API_KEY env var
});
const response = await client.chat.completions.create({
  model: 'qwen3.6:27b',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(response.choices[0].message.content);
Streaming
Stream responses token by token using async iteration.
const stream = await client.chat.completions.create({
  model: 'qwen3.6:27b',
  messages: [{ role: 'user', content: 'Write a poem' }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
Image generation
Generate images from a text prompt. Returns base64-encoded PNGs by
default — decode with Buffer.from(b64, 'base64') and write
to disk, or set response_format: 'url' when supported by the
model.
import { writeFileSync } from 'node:fs';
const response = await client.images.generations.create({
  model: 'x/z-image-turbo',
  prompt: 'A red London double-decker bus at sunset',
  size: '1024x1024',
});
const b64 = response.data[0].b64_json;
if (b64) {
  writeFileSync('bus.png', Buffer.from(b64, 'base64'));
}
Image generation is non-streaming: the endpoint returns a single JSON response once the worker finishes.
Embeddings
Generate vector embeddings for retrieval, search, and RAG pipelines.
OpenAI-compatible — pass a string or array of strings and get back a
CreateEmbeddingResponse with one embedding per input.
const response = await client.embeddings.create({
  model: 'nomic-embed-text:latest',
  input: ['The quick brown fox', 'jumps over the lazy dog'],
});
for (const item of response.data) {
  console.log(item.index, (item.embedding as number[]).length, 'dims');
}
console.log(response.usage.prompt_tokens);
Any embedding model in the Pendra catalogue works, for example nomic-embed-text, mxbai-embed-large, bge-m3, qwen3-embedding, and all-minilm.
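For retrieval and search, the returned vectors are usually compared by cosine similarity. The following is a minimal sketch of that comparison on plain number arrays; the helper names cosineSimilarity and rankBySimilarity are illustrative and not part of the SDK.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so report no similarity.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score each document against the query and sort best match first.
function rankBySimilarity(
  query: number[],
  docs: { id: string; embedding: number[] }[],
): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosineSimilarity(query, d.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

In practice the query vector and the document vectors would both come from client.embeddings.create() with the same model, since vectors from different models are not comparable.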
Audio transcription
Transcribe audio to text using Whisper-class models. Multipart upload —
pass a Blob/File directly, or a
{ filename, content } object where content is a
Blob, ArrayBuffer, or Uint8Array.
Files capped at 25 MB.
import { readFileSync } from 'node:fs';
const audio = readFileSync('meeting.mp3');
const result = await client.audio.transcriptions.create({
  file: { filename: 'meeting.mp3', content: audio },
  model: 'whisper-large-v3-turbo',
  language: 'en',
});
console.log(result.text);
In the browser (or Bun), pass the File from an
<input type="file"> straight through. Set
response_format: 'srt' or 'vtt' to get
subtitles back instead of JSON:
// Browser / Bun: pass a Blob or File directly
const input = document.querySelector<HTMLInputElement>('input[type=file]');
const file = input?.files?.[0]; // a File (which extends Blob)
if (!file) throw new Error('No file selected');
const result = await client.audio.transcriptions.create({
  file,
  model: 'whisper-large-v3-turbo',
  response_format: 'srt', // or 'vtt' for subtitles
});
console.log(result.text); // SRT string — text/srt/vtt are wrapped as { text }
Transcription is non-streaming. result.duration and
result.language are populated when the backend reports them;
result.segments appears when
response_format: 'verbose_json'.
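When response_format is 'srt', result.text holds the subtitle file as one string. If individual cues are needed, the string can be parsed with a few lines of code. This is a rough sketch assuming standard SubRip layout (index line, time line, text lines, blank-line separator); parseSrt and SrtCue are hypothetical names, not SDK exports.

```typescript
interface SrtCue {
  index: number;
  startMs: number;
  endMs: number;
  text: string;
}

// Convert an SRT timestamp such as "00:01:02,500" to milliseconds.
function srtTimeToMs(stamp: string): number {
  const match = /^(\d+):(\d{2}):(\d{2})[,.](\d{3})$/.exec(stamp.trim());
  if (!match) throw new Error(`Bad SRT timestamp: ${stamp}`);
  const hours = Number(match[1]);
  const minutes = Number(match[2]);
  const seconds = Number(match[3]);
  const millis = Number(match[4]);
  return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis;
}

// Split an SRT document into cues. Blocks are separated by blank lines.
function parseSrt(srt: string): SrtCue[] {
  return srt
    .replace(/\r\n/g, '\n')
    .trim()
    .split(/\n{2,}/)
    .filter((block) => block.trim() !== '')
    .map((block) => {
      const [indexLine, timeLine, ...textLines] = block.split('\n');
      if (!timeLine || !timeLine.includes('-->')) {
        throw new Error(`Malformed SRT block: ${block}`);
      }
      const [start, end] = timeLine.split('-->');
      return {
        index: Number(indexLine),
        startMs: srtTimeToMs(start),
        endMs: srtTimeToMs(end),
        text: textLines.join('\n'),
      };
    });
}
```

With the browser example above, parseSrt(result.text) would yield one object per subtitle cue.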
List models
const models = await client.models.list();
models.forEach((m) => console.log(m.id));
Migrating from OpenAI
The Pendra SDK mirrors the OpenAI interface. Switching takes two lines, the import and the constructor, and your existing code keeps working.
// Before
import OpenAI from 'openai';
const client = new OpenAI({ apiKey: 'sk-...' });
// After
import Pendra from 'pendra';
const client = new Pendra({ apiKey: 'pdr_sk_...' });
API reference
client.chat.completions.create()
Create a chat completion. Returns ChatCompletion or Stream.
| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID (e.g. "qwen3.6:27b") |
| messages | Array | Chat messages with role and content |
| stream | boolean? | Enable streaming (default false) |
| temperature | number? | Sampling temperature (0–2) |
| max_tokens | number? | Maximum tokens to generate |
| top_p | number? | Top-p sampling value |
| stop | string \| string[]? | Stop sequence(s) |
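When stream is true, callers often want both the live tokens and the final text. The sketch below drains any async iterable of chunks into one string. ChatChunk is a minimal structural type covering only the fields read in the streaming example, and collectStream is an illustrative helper, not an SDK function; it assumes the SDK's Stream satisfies AsyncIterable, which the for await usage suggests.

```typescript
// Only the fields used here; the SDK's own chunk type is assumed to be richer.
interface ChatChunk {
  choices: { delta?: { content?: string | null } }[];
}

// Text carried by one streamed chunk ('' when the chunk has none).
function deltaText(chunk: ChatChunk): string {
  return chunk.choices[0]?.delta?.content ?? '';
}

// Drain a stream into a single string, reporting each non-empty piece as it arrives.
async function collectStream(
  stream: AsyncIterable<ChatChunk>,
  onText?: (piece: string) => void,
): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    const piece = deltaText(chunk);
    if (piece) onText?.(piece);
    full += piece;
  }
  return full;
}
```

A call might look like const text = await collectStream(stream, (t) => process.stdout.write(t)), which prints tokens as they arrive and still returns the complete reply.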
client.images.generations.create()
Generate images from a text prompt. Returns Promise<ImageResponse>.
| Parameter | Type | Description |
|---|---|---|
| model | string | Image model ID (e.g. "x/z-image-turbo") |
| prompt | string | Text description of the image to generate |
| n | number? | Number of images, 1–4 (default 1) |
| size | string? | Dimensions as WIDTHxHEIGHT (default "1024x1024") |
| response_format | string? | "b64_json" (default) or "url" |
| num_inference_steps | number? | Diffusion steps (model-dependent) |
| seed | number? | Random seed for reproducibility |
| negative_prompt | string? | Text to avoid in the generated image |
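The size and n constraints in the table can be checked locally before a request is sent. This is a small sketch of such checks; parseImageSize and checkImageCount are hypothetical helpers, and the server may enforce further model-specific limits that are not covered here.

```typescript
interface ImageSize {
  width: number;
  height: number;
}

// Parse a "WIDTHxHEIGHT" string such as "1024x1024".
function parseImageSize(size: string): ImageSize {
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (!match) {
    throw new Error(`size must look like "1024x1024", got "${size}"`);
  }
  const width = Number(match[1]);
  const height = Number(match[2]);
  if (width === 0 || height === 0) {
    throw new Error('size dimensions must be positive');
  }
  return { width, height };
}

// Check that n is an integer in the documented 1-4 range (default 1).
function checkImageCount(n: number = 1): number {
  if (!Number.isInteger(n) || n < 1 || n > 4) {
    throw new RangeError(`n must be an integer from 1 to 4, got ${n}`);
  }
  return n;
}
```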
client.embeddings.create()
Create embeddings. Returns Promise<CreateEmbeddingResponse>.
| Parameter | Type | Description |
|---|---|---|
| model | string | Embedding model ID (e.g. "nomic-embed-text:latest") |
| input | string \| string[] | Text to embed. Accepts a single string or a batch. |
| encoding_format | "float" \| "base64"? | Defaults to float |
| dimensions | number? | Output dimensionality (Matryoshka models like nomic-embed-text) |
| user | string? | Optional end-user identifier |
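Because input accepts a batch, a large corpus can be embedded across several requests. The sketch below only shows the splitting step; toBatches is an illustrative helper, and the batch size is the caller's choice since this page does not state a per-request limit.

```typescript
// Split a list into consecutive batches of at most batchSize items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Each batch would then be passed as input to client.embeddings.create(). Within one response, item.index identifies the position inside that batch, so the caller needs to add the batch offset to recover the position in the full list.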
client.models.list()
Returns an array of Model objects. Each model has id,
object, created, and owned_by
fields.
client.audio.transcriptions.create()
Transcribe an audio file. Returns TranscriptionResponse.
| Parameter | Type | Description |
|---|---|---|
| file | AudioFileInput | Blob/File, or { filename, content }. ≤ 25 MB. |
| model | string | Transcription model id (e.g. "whisper-large-v3-turbo") |
| language | string? | ISO-639-1 language hint, optional |
| prompt | string? | Biasing prompt (vocabulary, formatting), optional |
| response_format | string? | "json" (default), "text", "srt", "vtt", or "verbose_json" |
| temperature | number? | Sampling temperature 0.0–1.0 |
| timestamp_granularities | ("word" \| "segment")[]? | verbose_json only |
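The 25 MB cap on file can be checked before uploading to avoid a wasted round trip. A minimal sketch follows. It reads "25 MB" conservatively as 25,000,000 bytes, which is an assumption since the page does not say whether the limit is decimal or binary, and checkAudioSize is a hypothetical helper.

```typescript
// Assumption: the conservative (decimal) reading of the documented 25 MB cap.
const MAX_AUDIO_BYTES = 25 * 1000 * 1000;

// Throw if the payload is larger than the limit; otherwise do nothing.
function checkAudioSize(byteLength: number, limit: number = MAX_AUDIO_BYTES): void {
  if (byteLength > limit) {
    throw new RangeError(
      `Audio is ${byteLength} bytes, which exceeds the ${limit}-byte limit`,
    );
  }
}
```

For the Node example earlier, checkAudioSize(audio.byteLength) would run before the create() call; for a browser File, file.size gives the same number.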
Configuration
| Option | Env var | Default |
|---|---|---|
| apiKey | PENDRA_API_KEY | — |
| baseURL | — | https://api.pendra.ai |
| timeout | — | 120000 (ms) |
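One way to picture how these options and defaults combine is the sketch below. It is not the SDK's implementation: resolveOptions is a hypothetical function, and the precedence shown (an explicit apiKey wins over PENDRA_API_KEY) is an assumption based on the "or set PENDRA_API_KEY env var" comment in the quick start.

```typescript
interface ClientOptions {
  apiKey?: string;
  baseURL?: string;
  timeout?: number; // milliseconds
}

interface ResolvedOptions {
  apiKey: string;
  baseURL: string;
  timeout: number;
}

// Merge explicit options with the environment and the documented defaults.
// Assumption: explicit options take precedence over environment variables.
function resolveOptions(
  options: ClientOptions,
  env: Record<string, string | undefined>,
): ResolvedOptions {
  const apiKey = options.apiKey ?? env.PENDRA_API_KEY;
  if (!apiKey) {
    throw new Error('Missing API key: pass apiKey or set PENDRA_API_KEY');
  }
  return {
    apiKey,
    baseURL: options.baseURL ?? 'https://api.pendra.ai',
    timeout: options.timeout ?? 120000,
  };
}
```

In Node the second argument would be process.env. Passing the environment in explicitly keeps the function easy to test.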
Error handling
All exceptions extend APIError.
| Exception | Status | When |
|---|---|---|
| AuthenticationError | 401 | Invalid or missing API key |
| RateLimitError | 429 | Too many requests |
| APIStatusError | 4xx/5xx | Any other non-2xx response |
| APIConnectionError | — | Network or connection failure |
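A common pattern on top of these exceptions is to retry rate limits, server errors, and connection failures with exponential backoff, while failing fast on everything else. The sketch below assumes thrown errors expose a numeric status property for HTTP failures and none for connection failures; that shape is inferred from the table and is not confirmed here. The names isRetryableStatus, backoffDelayMs, and withRetry are illustrative.

```typescript
// Retry on 429, on 5xx, and when there is no status (treated as a connection failure).
function isRetryableStatus(status: number | undefined): boolean {
  if (status === undefined) return true;
  return status === 429 || status >= 500;
}

// Exponential backoff: base, 2x base, 4x base, ... capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Run fn, retrying retryable failures up to maxAttempts total tries.
async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 4): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Assumption: HTTP errors carry `status`. Real code would rather narrow
      // with instanceof checks, since any unrelated error also lacks a status.
      const raw = (err as { status?: unknown } | null)?.status;
      const status = typeof raw === 'number' ? raw : undefined;
      const lastTry = attempt + 1 >= maxAttempts;
      if (lastTry || !isRetryableStatus(status)) throw err;
      const delay = backoffDelayMs(attempt);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A call might be wrapped as withRetry(() => client.chat.completions.create({ ... })). Authentication failures (401) are not retried because repeating the request cannot fix a bad key.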