Models & catalogue

Pendra exposes two model surfaces: a live models endpoint that lists what's actually serving right now across connected workers, and a catalogue endpoint that lists everything Pendra can install for you via the curated-install flow.

Live models

GET /api/v1/models returns the OpenAI-shaped models list:

curl

curl https://api.pendra.ai/api/v1/models \
  -H "Authorization: Bearer pdr_sk_..."

Python

from pendra import Pendra

client = Pendra()
for m in client.models.list():
    print(m.id)

TypeScript

import Pendra from 'pendra';

const client = new Pendra();
const models = await client.models.list();
models.forEach((m) => console.log(m.id));

Response

Standard OpenAI list envelope. Each entry is what's actually serving on a connected worker right now — restart a worker and the list updates within seconds.

{
  "object": "list",
  "data": [
    {
      "id": "llama3.3:70b",
      "object": "model",
      "created": 1733519426,
      "owned_by": "pendra"
    },
    {
      "id": "qwen3.6:27b",
      "object": "model",
      "created": 1776313891,
      "owned_by": "pendra"
    },
    {
      "id": "nomic-embed-text",
      "object": "model",
      "created": 1707947709,
      "owned_by": "pendra"
    }
  ]
}

created is the model's release date, in Unix epoch seconds. It's 0 for the occasional model we don't have a date for.
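
As a rough sketch, a client could turn created into a calendar date and treat 0 as "unknown", as described above. The helper name release_date is hypothetical and not part of the Pendra SDK.

```python
from datetime import date, datetime, timezone
from typing import Optional


def release_date(created: int) -> Optional[date]:
    """Convert a `created` value (Unix epoch seconds) to a UTC date.

    Returns None when `created` is 0, the value the API uses for
    models without a known release date.
    """
    if created == 0:
        return None
    return datetime.fromtimestamp(created, tz=timezone.utc).date()


# Value taken from the sample response above.
print(release_date(1733519426))  # 2024-12-06
print(release_date(0))           # None
```

Interpreting the timestamp in UTC keeps the date stable regardless of the machine's local timezone.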

Filter by type

Pass ?type= to narrow the list to a single capability: chat, image, embedding, rerank, ocr, or transcription.

curl

curl "https://api.pendra.ai/api/v1/models?type=embedding" \
  -H "Authorization: Bearer pdr_sk_..."
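
The same filter can be built programmatically. The sketch below only constructs the URL; the helper name models_url and the client-side check against the listed capability values are assumptions for illustration, since this page does not say how the server responds to an unknown type.

```python
from typing import Optional
from urllib.parse import urlencode

# Capability values listed in this section.
MODEL_TYPES = {"chat", "image", "embedding", "rerank", "ocr", "transcription"}


def models_url(model_type: Optional[str] = None,
               base: str = "https://api.pendra.ai/api/v1/models") -> str:
    """Build the live-models URL, optionally narrowed with ?type=."""
    if model_type is None:
        return base
    if model_type not in MODEL_TYPES:
        # Client-side guard only; server behaviour is not documented here.
        raise ValueError(f"unknown model type: {model_type!r}")
    return f"{base}?{urlencode({'type': model_type})}"


print(models_url("embedding"))
# https://api.pendra.ai/api/v1/models?type=embedding
```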

Codex compatibility

OpenAI Codex sends a client_version query parameter and expects a different envelope ({"models": [...]} instead of the standard OpenAI list). Pendra detects Codex and returns the right shape automatically — no configuration needed.
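
Tooling that may see either envelope (for example, a proxy or a test harness) could normalise them as sketched below. The helper model_entries is hypothetical, and the shape of each entry inside the Codex envelope is assumed here because this page does not describe it.

```python
from typing import Any, Dict, List


def model_entries(payload: Dict[str, Any]) -> List[Any]:
    """Return the model entries from either envelope shape."""
    if "data" in payload:      # standard OpenAI list envelope
        return payload["data"]
    if "models" in payload:    # Codex-style envelope
        return payload["models"]
    raise ValueError("unrecognised models envelope")


standard = {"object": "list", "data": [{"id": "llama3.3:70b"}]}
codex = {"models": [{"id": "llama3.3:70b"}]}  # entry shape assumed
print(model_entries(standard) == model_entries(codex))  # True
```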

The model catalogue

GET /api/v1/catalogue is a public endpoint (no auth required) listing every model Pendra can install on your worker via one click in the console. Catalogue installs download a verified GGUF onto the worker. Every entry is vetted by Pendra before it appears in the catalogue.

curl

curl https://api.pendra.ai/api/v1/catalogue

Catalogue response

Returns a flat list of catalogue entries under data. Each entry describes a canonical Pendra model plus its variants (sizes and quantisations), and each variant carries the URL and SHA-256 of the verified GGUF that Pendra installs.

{
  "data": [
    {
      "id": "gemma4",
      "name": "Gemma 4",
      "publisher": "Google DeepMind",
      "family": "gemma4_unified",
      "description": "...",
      "capabilities": ["completion", "vision", "thinking", "tools"],
      "context_length": 262144,
      "homepage": "https://...",
      "licence": "apache-2.0",
      "reasoning": "reasoning",
      "variants": [
        {
          "id": "gemma4:12b",
          "label": "12B",
          "parameter_size": "12B",
          "architecture": "dense",
          "quantization": "Q4_K_M",
          "context_length": 262144,
          "disk_size": 7300000000,
          "gguf_url": "https://huggingface.co/...",
          "gguf_sha256": "...",
          "gguf_size_bytes": 7300000000,
          "task": "chat",
          "installable_via": ["dashboard", "cli"]
        },
        {
          "id": "gemma4:26b",
          "label": "26B",
          "parameter_size": "26B",
          "architecture": "moe",
          "active_parameters": "4B",
          "quantization": "Q4_K_M",
          "context_length": 262144,
          "disk_size": 16000000000,
          "gguf_url": "https://huggingface.co/...",
          "gguf_sha256": "...",
          "gguf_size_bytes": 16000000000,
          "task": "chat",
          "installable_via": ["dashboard", "cli"]
        }
      ]
    },
    {
      "id": "nomic-embed-text",
      "name": "nomic-embed-text-v1.5",
      "publisher": "Nomic",
      "family": "nomic-embed-text",
      "description": "...",
      "capabilities": ["embedding"],
      "context_length": 2048,
      "homepage": "https://...",
      "licence": "apache-2.0",
      "variants": [
        {
          "id": "nomic-embed-text:nomic-embed-text-v1.5",
          "label": "v1.5",
          "quantization": "Q4_K_M",
          "disk_size": 84106624,
          "gguf_url": "https://huggingface.co/...",
          "gguf_sha256": "...",
          "gguf_size_bytes": 84106624,
          "task": "embedding",
          "installable_via": ["dashboard", "cli"]
        }
      ]
    }
  ]
}
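
To show how the nesting reads in practice, here is a minimal sketch that walks a parsed catalogue payload and summarises each variant. The helper variant_summary is hypothetical; the sample is a trimmed copy of the response above, and the payload could equally come from any HTTP client since the endpoint needs no auth.

```python
from typing import Any, Dict, List, Tuple


def variant_summary(catalogue: Dict[str, Any]) -> List[Tuple[str, str, float]]:
    """List (variant id, quantization, disk size in GB) for every variant."""
    rows = []
    for entry in catalogue["data"]:
        for variant in entry["variants"]:
            rows.append((
                variant["id"],
                variant["quantization"],
                round(variant["disk_size"] / 1e9, 2),  # decimal gigabytes
            ))
    return rows


# Trimmed copy of the sample response above.
sample = {
    "data": [
        {"id": "gemma4", "variants": [
            {"id": "gemma4:12b", "quantization": "Q4_K_M", "disk_size": 7300000000},
            {"id": "gemma4:26b", "quantization": "Q4_K_M", "disk_size": 16000000000},
        ]},
        {"id": "nomic-embed-text", "variants": [
            {"id": "nomic-embed-text:nomic-embed-text-v1.5",
             "quantization": "Q4_K_M", "disk_size": 84106624},
        ]},
    ]
}

for row in variant_summary(sample):
    print(row)
```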

What's in a catalogue entry

id — canonical Pendra ID (e.g. qwen3.5).
capabilities — an array describing what the model can do: completion, vision, tools, thinking, embedding, rerank, ocr, image.
variants — sizes and quantisations, each carrying the GGUF install metadata: gguf_url + gguf_sha256 (the verified GGUF Pendra downloads and checks), gguf_size_bytes, disk_size (total on-disk footprint in bytes, including the vision projector and every shard where applicable), and installable_via (e.g. ["dashboard", "cli"]).
gguf_parts — present instead of gguf_url on very large models whose weights ship as several files (sharded GGUF, e.g. a 235B MoE or a frontier model). It's an ordered list of { url, sha256, size_bytes } parts; Pendra downloads and verifies every part, then loads the model from the first one. You don't need to handle this differently — installing the variant works the same way.
architecture — on each chat variant: dense, moe (Mixture-of-Experts), hybrid, or diffusion. MoE variants also carry active_parameters (the per-token count, e.g. "3B" for a 35B model — see the architecture guide). Embedding, transcription, and image models omit these.
reasoning — at the model level: reasoning (thinks before answering) or hybrid (thinking you can toggle on or off per request). Absent on non-reasoning models.
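The remaining fields are parameter_size (per variant), context_length (on the model and repeated on variants), and licence (e.g. apache-2.0).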
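
Pendra downloads and verifies the files itself, so none of this is required. For anyone who wants to check a manually downloaded file, the sketch below treats gguf_url and gguf_parts uniformly and compares a streaming SHA-256. The helper names are hypothetical, and two things are assumed: that gguf_parts sits on the variant, and that checksums are hex-encoded.

```python
import hashlib
from typing import Any, Dict, List


def variant_files(variant: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Normalise a variant to an ordered list of {url, sha256, size_bytes}."""
    if "gguf_parts" in variant:  # sharded GGUF: already a list of parts
        return list(variant["gguf_parts"])
    return [{
        "url": variant["gguf_url"],
        "sha256": variant["gguf_sha256"],
        "size_bytes": variant["gguf_size_bytes"],
    }]


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large GGUF never sits fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, part: Dict[str, Any]) -> bool:
    """Check a downloaded file against a part's published checksum."""
    return sha256_of(path) == part["sha256"].lower()
```

A caller would iterate over variant_files(variant) in order and call matches on each downloaded file.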

Curated installs

Every catalogued variant installs onto the worker — one click in the console, or pendra models install <id> from the CLI. The installable_via field on each variant tells you where it can be installed from (dashboard, cli, or both).
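
As an illustration of reading installable_via, this sketch collects the variant ids that can be installed from a given surface. The helper installable_from and the dashboard-only entry in the sample are hypothetical.

```python
from typing import Any, Dict, List


def installable_from(catalogue: Dict[str, Any], surface: str) -> List[str]:
    """Return ids of variants whose installable_via includes `surface`."""
    return [
        variant["id"]
        for entry in catalogue["data"]
        for variant in entry["variants"]
        if surface in variant.get("installable_via", [])
    ]


sample = {
    "data": [
        {"id": "gemma4", "variants": [
            {"id": "gemma4:12b", "installable_via": ["dashboard", "cli"]},
        ]},
        # Hypothetical dashboard-only entry, for illustration.
        {"id": "example", "variants": [
            {"id": "example:1b", "installable_via": ["dashboard"]},
        ]},
    ]
}

print(installable_from(sample, "cli"))  # ['gemma4:12b']
```

Each id returned for "cli" is a candidate for pendra models install <id>.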

Naming conventions

Chat models use the name:tag convention (e.g. qwen3.5:0.8b, llama3.3:70b). Embedding models use a slug-style id (e.g. nomic-embed-text:nomic-embed-text-v1.5). The full set of names lives in the catalogue endpoint above.
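
A minimal sketch of splitting an id along the name:tag convention follows. The helper split_model_id is hypothetical; ids without a colon (such as nomic-embed-text in the live models list) come back with no tag.

```python
from typing import Optional, Tuple


def split_model_id(model_id: str) -> Tuple[str, Optional[str]]:
    """Split "name:tag" into (name, tag); tag is None when there is no colon."""
    name, sep, tag = model_id.partition(":")
    return (name, tag if sep else None)


print(split_model_id("qwen3.5:0.8b"))      # ('qwen3.5', '0.8b')
print(split_model_id("nomic-embed-text"))  # ('nomic-embed-text', None)
```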