Node.js SDK

The Pendra Node SDK is a TypeScript-first, zero-dependency client for sovereign UK inference. It supports streaming, ships as dual ESM/CJS, and requires Node.js 18 or later.

Installation

bash

$ npm install pendra

The package has zero runtime dependencies and is published on npm.

Quick start

quickstart.ts

import Pendra from 'pendra';

const client = new Pendra({
  apiKey: 'pdr_sk_...', // or set PENDRA_API_KEY env var
});

const response = await client.chat.completions.create({
  model: 'qwen3.6:27b',
  messages: [{ role: 'user', content: 'Hello!' }],
});

console.log(response.choices[0].message.content);

Streaming

Stream responses token by token using async iteration.

stream.ts

const stream = await client.chat.completions.create({
  model: 'qwen3.6:27b',
  messages: [{ role: 'user', content: 'Write a poem' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
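When the full reply is needed after streaming finishes (for logging or caching, say), the deltas can be accumulated while they are printed. The sketch below is a minimal helper under stated assumptions: `collectStream`, `ChunkLike`, and `fakeStream` are hypothetical names, not SDK exports, and `ChunkLike` models only the `choices[0].delta.content` path used in the loop above.

```typescript
// Minimal structural type for the part of a chunk this helper reads.
// Assumption: it mirrors the field access in the streaming loop, nothing more.
interface ChunkLike {
  choices: Array<{ delta?: { content?: string | null } }>;
}

// Collects streamed deltas into one string, optionally forwarding each
// non-empty piece to a callback (for example, to print it as it arrives).
async function collectStream(
  stream: AsyncIterable<ChunkLike>,
  onToken?: (token: string) => void,
): Promise<string> {
  let text = '';
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta?.content ?? '';
    if (token) {
      onToken?.(token);
      text += token;
    }
  }
  return text;
}

// Stand-in stream for local experiments. A real stream would come from
// client.chat.completions.create({ ..., stream: true }).
async function* fakeStream(): AsyncGenerator<ChunkLike> {
  yield { choices: [{ delta: { content: 'Hello' } }] };
  yield { choices: [{ delta: {} }] }; // chunk without content
  yield { choices: [] }; // chunk without choices
  yield { choices: [{ delta: { content: ', world' } }] };
}
```

With a real stream, a call such as `collectStream(stream, (t) => process.stdout.write(t))` would print tokens as they arrive and resolve to the complete text.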

Image generation

Generate images from a text prompt. Returns base64-encoded PNGs by default — decode with Buffer.from(b64, 'base64') and write to disk, or set response_format: 'url' when supported by the model.

images.ts

import { writeFileSync } from 'node:fs';

const response = await client.images.generations.create({
  model: 'x/z-image-turbo',
  prompt: 'A red London double-decker bus at sunset',
  size: '1024x1024',
});

const b64 = response.data[0].b64_json;
if (b64) {
  writeFileSync('bus.png', Buffer.from(b64, 'base64'));
}

Image generation is non-streaming — the endpoint returns a single JSON response once the worker finishes.
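When a request returns several images, each base64 payload can be decoded in turn and sanity-checked before it is written to disk. The following is a sketch, not SDK code: `decodeImages`, `ImageItem`, and `DecodedImage` are hypothetical names, and `ImageItem` models only the `b64_json` field read in the example above. The check relies on the standard 8-byte PNG file signature.

```typescript
// Shape of the field read here; mirrors response.data[i].b64_json.
interface ImageItem {
  b64_json?: string | null;
}

interface DecodedImage {
  filename: string;
  bytes: Buffer;
}

// The 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Decodes every item that carries base64 data and names it `${prefix}-${index}.png`.
// Items without b64_json (for example when response_format is 'url') are skipped.
function decodeImages(data: ImageItem[], prefix: string): DecodedImage[] {
  const decoded: DecodedImage[] = [];
  data.forEach((item, index) => {
    if (!item.b64_json) return;
    const bytes = Buffer.from(item.b64_json, 'base64');
    if (!bytes.subarray(0, 8).equals(PNG_SIGNATURE)) {
      throw new Error(`Image ${index} does not look like a PNG`);
    }
    decoded.push({ filename: `${prefix}-${index}.png`, bytes });
  });
  return decoded;
}
```

Each returned entry can then be passed to `writeFileSync(entry.filename, entry.bytes)`.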

Embeddings

Generate vector embeddings for retrieval, search, and RAG pipelines. OpenAI-compatible — pass a string or array of strings and get back a CreateEmbeddingResponse with one embedding per input.

embeddings.ts

const response = await client.embeddings.create({
  model: 'nomic-embed-text:latest',
  input: ['The quick brown fox', 'jumps over the lazy dog'],
});

for (const item of response.data) {
  console.log(item.index, (item.embedding as number[]).length, 'dims');
}

console.log(response.usage.prompt_tokens);

Any embedding model in the Pendra catalogue works — nomic-embed-text, mxbai-embed-large, bge-m3, qwen3-embedding, all-minilm.
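For retrieval and search, the returned vectors are typically compared with cosine similarity. The sketch below ranks stored embeddings against a query vector; `cosineSimilarity`, `rankBySimilarity`, and `EmbeddingItem` are hypothetical helper names, and `EmbeddingItem` models only the `index` and `embedding` fields used in the loop above.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// The fields of response.data[i] that ranking needs.
interface EmbeddingItem {
  index: number;
  embedding: number[];
}

// Scores every item against the query and sorts best match first.
function rankBySimilarity(
  query: number[],
  items: EmbeddingItem[],
): Array<{ index: number; score: number }> {
  return items
    .map((item) => ({ index: item.index, score: cosineSimilarity(query, item.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

In a RAG pipeline, the query vector would come from a separate `client.embeddings.create()` call using the same model as the stored documents, since vectors from different models are not comparable.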

Audio transcription

Transcribe audio to text using Whisper-class models. The request is a multipart upload: pass a Blob/File directly, or a { filename, content } object where content is a Blob, ArrayBuffer, or Uint8Array. Files are capped at 25 MB.

transcribe.ts

import { readFileSync } from 'node:fs';

const audio = readFileSync('meeting.mp3');
const result = await client.audio.transcriptions.create({
  file: { filename: 'meeting.mp3', content: audio },
  model: 'whisper-large-v3-turbo',
  language: 'en',
});

console.log(result.text);
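Because uploads are capped at 25 MB, it can be worth checking the size locally before sending the request. This is a sketch under stated assumptions: `assertUploadable` and `byteLengthOf` are hypothetical helpers, and the limit is read as 25 * 1024 * 1024 bytes, since the documentation does not say whether the cap uses binary or decimal megabytes.

```typescript
// Assumption: "25 MB" is interpreted as binary megabytes.
const MAX_AUDIO_BYTES = 25 * 1024 * 1024;

// Returns the size in bytes of any of the accepted content types.
function byteLengthOf(content: Blob | ArrayBuffer | Uint8Array): number {
  if (content instanceof Blob) return content.size;
  return content.byteLength;
}

// Throws a descriptive error for empty or oversized audio; otherwise
// returns the size in bytes.
function assertUploadable(
  content: Blob | ArrayBuffer | Uint8Array,
  filename: string,
): number {
  const size = byteLengthOf(content);
  if (size === 0) {
    throw new Error(`${filename} is empty`);
  }
  if (size > MAX_AUDIO_BYTES) {
    const mb = (size / (1024 * 1024)).toFixed(1);
    throw new Error(`${filename} is ${mb} MB, over the 25 MB limit`);
  }
  return size;
}
```

A Buffer returned by `readFileSync` is a Uint8Array, so `assertUploadable(audio, 'meeting.mp3')` can be called directly before `client.audio.transcriptions.create()`.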

In the browser (or Bun), pass the File from an <input type="file"> straight through. Set response_format: 'srt' or 'vtt' to get subtitles back instead of JSON:

transcribe.browser.ts

// Browser / Bun: pass a Blob or File directly
const input = document.querySelector<HTMLInputElement>('input[type=file]');
const file = input?.files?.[0]; // a File (which extends Blob)
if (!file) throw new Error('No file selected');

const result = await client.audio.transcriptions.create({
  file,
  model: 'whisper-large-v3-turbo',
  response_format: 'srt', // or 'vtt' for subtitles
});

console.log(result.text); // SRT string — text/srt/vtt are wrapped as { text }

Transcription is non-streaming. result.duration and result.language are populated when the backend reports them; result.segments appears when response_format: 'verbose_json'.
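When `response_format` is `'srt'`, the subtitle text arrives as one string in `result.text`. If individual cues are needed (for seeking or search, say), the standard SRT layout of index line, `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, and text lines can be parsed with a few lines of code. The sketch below is illustrative only: `parseSrt`, `timestampToMs`, and `Cue` are hypothetical names, and it assumes well-formed SRT input.

```typescript
interface Cue {
  index: number;
  startMs: number;
  endMs: number;
  text: string;
}

// Converts "HH:MM:SS,mmm" to milliseconds.
function timestampToMs(ts: string): number {
  const match = /^(\d{2,}):(\d{2}):(\d{2})[,.](\d{3})$/.exec(ts.trim());
  if (!match) throw new Error(`Bad SRT timestamp: ${ts}`);
  const [, h, m, s, ms] = match;
  return ((Number(h) * 60 + Number(m)) * 60 + Number(s)) * 1000 + Number(ms);
}

// Splits an SRT document into cues. Blocks are separated by blank lines;
// blocks without a timing line are skipped.
function parseSrt(srt: string): Cue[] {
  const cues: Cue[] = [];
  const blocks = srt.replace(/\r\n/g, '\n').trim().split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split('\n');
    if (lines.length < 2) continue;
    const [start, end] = lines[1].split('-->');
    if (start === undefined || end === undefined) continue;
    cues.push({
      index: Number(lines[0]),
      startMs: timestampToMs(start),
      endMs: timestampToMs(end),
      text: lines.slice(2).join('\n'),
    });
  }
  return cues;
}
```

Calling `parseSrt(result.text)` on an SRT response would then yield an array of cues with millisecond offsets.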

List models

models.ts

const models = await client.models.list();
models.forEach((m) => console.log(m.id));

Migrating from OpenAI

The Pendra SDK mirrors the OpenAI interface, so switching takes two lines: the import and the client constructor. Existing calls keep the same shape.

migration.ts

// Before
import OpenAI from 'openai';
const client = new OpenAI({ apiKey: 'sk-...' });

// After
import Pendra from 'pendra';
const client = new Pendra({ apiKey: 'pdr_sk_...' });

API reference

`client.chat.completions.create()`

Create a chat completion. Returns ChatCompletion or Stream.

Parameter	Type	Description
`model`	`string`	Model ID (e.g. "qwen3.6:27b")
`messages`	`Array`	Chat messages with role and content
`stream`	`boolean?`	Enable streaming (default false)
`temperature`	`number?`	Sampling temperature (0–2)
`max_tokens`	`number?`	Maximum tokens to generate
`top_p`	`number?`	Top-p sampling value
`stop`	`string | string[]?`	Stop sequence(s)
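The sampling parameters in the table can be checked on the client before a request is sent, which surfaces mistakes without a round trip. This is a minimal sketch: `validateSampling` and `SamplingParams` are hypothetical names, the 0 to 2 range for `temperature` comes from the table, and the 0 to 1 range for `top_p` and the positive-integer rule for `max_tokens` are assumptions based on how these parameters conventionally behave.

```typescript
interface SamplingParams {
  temperature?: number;
  top_p?: number;
  max_tokens?: number;
}

// Returns a list of human-readable problems; an empty list means the
// values look plausible. NaN fails every range check.
function validateSampling(params: SamplingParams): string[] {
  const problems: string[] = [];
  const { temperature, top_p, max_tokens } = params;

  if (temperature !== undefined && !(temperature >= 0 && temperature <= 2)) {
    problems.push('temperature must be between 0 and 2');
  }
  // Assumption: top_p is a probability mass in [0, 1].
  if (top_p !== undefined && !(top_p >= 0 && top_p <= 1)) {
    problems.push('top_p must be between 0 and 1');
  }
  // Assumption: max_tokens must be a positive integer.
  if (max_tokens !== undefined && (!Number.isInteger(max_tokens) || max_tokens < 1)) {
    problems.push('max_tokens must be a positive integer');
  }
  return problems;
}
```

The server remains the authority on what it accepts; a check like this only catches obvious slips early.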

`client.images.generations.create()`

Generate images from a text prompt. Returns Promise<ImageResponse>.

Parameter	Type	Description
`model`	`string`	Image model ID (e.g. "x/z-image-turbo")
`prompt`	`string`	Text description of the image to generate
`n`	`number?`	Number of images, 1–4 (default 1)
`size`	`string?`	Dimensions as WIDTHxHEIGHT (default "1024x1024")
`response_format`	`string?`	"b64_json" (default) or "url"
`num_inference_steps`	`number?`	Diffusion steps (model-dependent)
`seed`	`number?`	Random seed for reproducibility
`negative_prompt`	`string?`	Text to avoid in the generated image

`client.embeddings.create()`

Create embeddings. Returns Promise<CreateEmbeddingResponse>.

Parameter	Type	Description
`model`	`string`	Embedding model ID (e.g. "nomic-embed-text:latest")
`input`	`string | string[]`	Text to embed. Accepts a single string or a batch.
`encoding_format`	`"float" | "base64"?`	Defaults to float
`dimensions`	`number?`	Output dimensionality (Matryoshka models like nomic-embed-text)
`user`	`string?`	Optional end-user identifier

`client.models.list()`

Returns an array of Model objects. Each model has id, object, created, and owned_by fields.

`client.audio.transcriptions.create()`

Transcribe an audio file. Returns TranscriptionResponse.

Parameter	Type	Description
`file`	`AudioFileInput`	`Blob`/`File`, or `{ filename, content }`. ≤ 25 MB.
`model`	`string`	Transcription model id (e.g. "whisper-large-v3-turbo")
`language`	`string?`	ISO-639-1 language hint, optional
`prompt`	`string?`	Biasing prompt (vocabulary, formatting), optional
`response_format`	`string?`	"json" (default), "text", "srt", "vtt", or "verbose_json"
`temperature`	`number?`	Sampling temperature 0.0–1.0
`timestamp_granularities`	`("word" | "segment")[]?`	verbose_json only

Configuration

Option	Env var	Default
`apiKey`	`PENDRA_API_KEY`	—
`baseURL`	—	`https://api.pendra.ai`
`timeout`	—	`120000` (ms)

Error handling

All exceptions extend APIError.

Exception	Status	When
`AuthenticationError`	401	Invalid or missing API key
`RateLimitError`	429	Too many requests
`APIStatusError`	4xx/5xx	Any other non-2xx response
`APIConnectionError`	—	Network or connection failure

Links

npm
Get an API key
API reference
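The exception classes listed under Error handling lend themselves to a small retry wrapper for rate limits. The sketch below retries with exponential backoff and is not part of the SDK: `withRetry`, `isRateLimited`, and `sleep` are hypothetical helpers, and the default predicate assumes the thrown error exposes a numeric `status` property equal to 429, which the table suggests but does not confirm.

```typescript
// Assumption: rate-limit errors carry a numeric `status` of 429.
// Swap in a different predicate if the SDK signals this another way.
function isRateLimited(error: unknown): boolean {
  return (
    typeof error === 'object' &&
    error !== null &&
    (error as { status?: unknown }).status === 429
  );
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls fn, retrying only when shouldRetry(error) is true. Delays grow as
// baseDelayMs, 2x, 4x, ... and the last error is rethrown once
// maxAttempts is reached.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 500,
  shouldRetry: (error: unknown) => boolean = isRateLimited,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt >= maxAttempts || !shouldRetry(error)) throw error;
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}
```

A call would look like `await withRetry(() => client.chat.completions.create({ ... }))`. Errors such as an invalid API key are rethrown immediately, since retrying them cannot succeed.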