
Responses API (OpenAI Codex)

Pendra implements the OpenAI Responses API at /v1/responses (also aliased at /api/v1/responses) so the OpenAI Codex CLI works without modification when pointed at Pendra.

Endpoint

POST https://api.pendra.ai/v1/responses
POST https://api.pendra.ai/api/v1/responses    # alias

Request

curl
curl https://api.pendra.ai/v1/responses \
  -H "Authorization: Bearer pdr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3.6:27b",
    "input": "Write a one-line summary of UK GDPR."
  }'
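
For callers that are not using curl, the same request can be sketched in Python with only the standard library. The helper names build_request and create_response are hypothetical, introduced here for illustration; the URL, headers, and body fields are the ones shown in the curl example above, and the 60-second timeout is an arbitrary assumption.

```python
import json
import urllib.request

# Endpoint taken from the "Endpoint" section above.
PENDRA_URL = "https://api.pendra.ai/v1/responses"


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        PENDRA_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_response(api_key: str, model: str, prompt: str, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON envelope.

    Hypothetical helper: the timeout value is an assumption, not a Pendra default.
    """
    request = build_request(api_key, model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as http_response:
        return json.load(http_response)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.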

Response

Pendra returns the OpenAI Responses envelope. output is an array of items; each message item carries an array of content blocks. status is completed on a clean finish, or incomplete when generation is cut off before the model finishes, for example on reaching max_output_tokens.

{
  "id": "resp_01HZ8b...",
  "object": "response",
  "created_at": 1715346400,
  "status": "completed",
  "model": "qwen3.6:27b",
  "output": [
    {
      "type": "message",
      "id": "msg_01",
      "role": "assistant",
      "status": "completed",
      "content": [
        {
          "type": "output_text",
          "text": "UK GDPR governs how organisations process personal data of UK residents, post-Brexit."
        }
      ]
    }
  ],
  "usage": {
    "input_tokens": 12,
    "output_tokens": 18,
    "total_tokens": 30
  }
}
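
Because the text sits two levels down (output items, then content blocks), a small helper is usually needed to read it. The following is a minimal sketch assuming the envelope has the shape shown above; output_text and finished_cleanly are hypothetical names, not part of any Pendra or OpenAI client library.

```python
from typing import Any, Mapping


def output_text(response: Mapping[str, Any]) -> str:
    """Concatenate every output_text block from every message item."""
    parts = []
    for item in response.get("output", []):
        # Skip any item that is not a message.
        if item.get("type") != "message":
            continue
        for block in item.get("content", []):
            if block.get("type") == "output_text":
                parts.append(block.get("text", ""))
    return "".join(parts)


def finished_cleanly(response: Mapping[str, Any]) -> bool:
    """True when the envelope reports status == "completed"."""
    return response.get("status") == "completed"
```

Applied to the example response above, output_text returns the single UK GDPR sentence and finished_cleanly returns True.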

Model mapping

Codex hard-codes OpenAI model names such as gpt-5-codex and gpt-5.5. When a request names an OpenAI model that Pendra does not recognise, Pendra falls back to an available chat model on your worker pool, so Codex works out of the box. To pin a specific Pendra model, set model in your Codex config; see Integrations → Codex.

Streaming

The Responses API uses its own event taxonomy (response.output_text.delta, response.completed, etc.). Pendra translates its streamed chat completions into these events, and the Codex CLI consumes them directly.

event: response.created
data: {"type":"response.created","response":{"id":"resp_01HZ8b","object":"response","status":"in_progress","model":"qwen3.6:27b"}}

event: response.output_item.added
data: {"type":"response.output_item.added","output_index":0,"item":{"type":"message","id":"msg_01","role":"assistant","status":"in_progress","content":[]}}

event: response.content_part.added
data: {"type":"response.content_part.added","item_id":"msg_01","output_index":0,"content_index":0,"part":{"type":"output_text","text":""}}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_01","output_index":0,"content_index":0,"delta":"Hello"}

event: response.output_text.delta
data: {"type":"response.output_text.delta","item_id":"msg_01","output_index":0,"content_index":0,"delta":"!"}

event: response.output_text.done
data: {"type":"response.output_text.done","item_id":"msg_01","output_index":0,"content_index":0,"text":"Hello!"}

event: response.completed
data: {"type":"response.completed","response":{"id":"resp_01HZ8b","status":"completed","usage":{"input_tokens":8,"output_tokens":2,"total_tokens":10}}}
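
For clients other than Codex, the event stream above can be consumed with a small server-sent-events reader. This is a minimal sketch assuming each event arrives as an event: line and a data: line holding one JSON object, separated by blank lines, as in the sample; iter_sse_events and collect_text are hypothetical helper names. It only parses lines it is given and does not open the connection itself.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield the decoded JSON payload of each event in a stream of text lines."""
    data_lines = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield json.loads("\n".join(data_lines))
                data_lines = []
            continue
        if line.startswith("data:"):
            value = line[len("data:"):]
            # Drop the single optional space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # "event:" lines are ignored: the payload repeats the type in "type".
    if data_lines:
        # Flush a final event that was not followed by a blank line.
        yield json.loads("\n".join(data_lines))


def collect_text(lines: Iterable[str]) -> Tuple[str, Optional[Dict[str, Any]]]:
    """Join all text deltas and return them with the final usage block, if any."""
    chunks = []
    usage = None
    for event in iter_sse_events(lines):
        kind = event.get("type")
        if kind == "response.output_text.delta":
            chunks.append(event["delta"])
        elif kind == "response.completed":
            usage = event.get("response", {}).get("usage")
    return "".join(chunks), usage
```

Accumulating deltas rather than waiting for response.output_text.done lets a caller render text as it arrives; the done event then serves as a consistency check on the joined string.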

Quick start

The full Codex setup (config file, env vars, and a model pin) lives in Integrations → Codex.