Embeddings

POST /api/v1/embeddings generates dense vector embeddings for one or more input strings. The request and response shapes match OpenAI's /v1/embeddings endpoint exactly.

Request

curl
curl https://api.pendra.ai/api/v1/embeddings \
  -H "Authorization: Bearer pdr_sk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nomic-embed-text:latest",
    "input": ["The quick brown fox", "jumps over the lazy dog"]
  }'

Fields

  • model — embedding model ID. List options at /models?type=embedding.
  • input — a single string or an array of strings.
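
For callers who are not using an SDK, the two fields above can be assembled into a request with the Python standard library alone. This is a minimal sketch, not an official client: the URL and headers are copied from the curl example, and the helper name build_embeddings_request is hypothetical.

```python
import json
import urllib.request

# URL copied from the curl example above.
EMBEDDINGS_URL = "https://api.pendra.ai/api/v1/embeddings"


def build_embeddings_request(api_key, model, inputs):
    """Build (but do not send) a POST request for the embeddings endpoint.

    `inputs` may be a single string or a list of strings, mirroring the
    `input` field. Hypothetical helper, not part of any Pendra SDK.
    """
    if not isinstance(inputs, str):
        inputs = list(inputs)
        if not all(isinstance(item, str) for item in inputs):
            raise TypeError("input must be a string or a list of strings")

    body = json.dumps({"model": model, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen(request, timeout=60) to send it; building and sending are kept separate here so the payload can be inspected first.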

Python

embed.py
from pendra import Pendra

client = Pendra()
result = client.embeddings.create(
    model="nomic-embed-text:latest",
    input=["the quick brown fox", "jumps over the lazy dog"],
)
print(result.data[0].embedding[:5])

Response

The response is an OpenAI-shaped list envelope. data contains one embedding entry per input string, and each entry's index field gives its position in the input. Vectors are returned as JSON arrays of floats by default; set encoding_format: "base64" in the request for a smaller wire size on big batches.

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0131, -0.0442, 0.0921, ..., -0.0073]
    },
    {
      "object": "embedding",
      "index": 1,
      "embedding": [0.0204, -0.0118, 0.0815, ..., 0.0026]
    }
  ],
  "model": "nomic-embed-text:latest",
  "usage": {
    "prompt_tokens": 10,
    "total_tokens": 10
  }
}
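
A response like the one above can be turned back into a list of vectors aligned with the inputs by sorting on index. The sketch below works on the parsed JSON body (a plain dict) and also handles base64-encoded vectors. The byte layout is an assumption: it treats a base64 payload as packed little-endian float32 values, which is how OpenAI's client libraries decode theirs; the Pendra reference should be checked before relying on it. Both function names are hypothetical.

```python
import base64
import struct


def decode_embedding(value):
    """Return a list of floats from a JSON float array or a base64 string.

    ASSUMPTION: base64 payloads are packed little-endian float32 values,
    following OpenAI's convention. Not confirmed for this API.
    """
    if isinstance(value, str):
        raw = base64.b64decode(value)
        if len(raw) % 4 != 0:
            raise ValueError("payload length is not a multiple of 4 bytes")
        count = len(raw) // 4
        return list(struct.unpack(f"<{count}f", raw))
    return list(value)


def embeddings_in_input_order(response):
    """Return one vector per input string, ordered by each entry's `index`."""
    entries = sorted(response["data"], key=lambda entry: entry["index"])
    return [decode_embedding(entry["embedding"]) for entry in entries]
```

Sorting on index rather than trusting list position keeps the result aligned with the inputs even if entries arrive out of order.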

Notes

  • Requests time out after roughly 60 seconds. Split large workloads into batches of a few hundred strings rather than sending a single huge request.
  • Embedding dimensionality and the scale of similarity scores depend on the model; check the model catalogue for both.
  • Embedding requests appear under Embeddings in the console usage view.
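
The batching advice in the notes above can be sketched as a small helper that splits the workload and concatenates the results in input order. It is illustrative only: embed_in_batches is a hypothetical name, the default batch size of 256 is one reading of "a few hundred", and embed_batch stands for any callable that sends one request and returns one vector per string.

```python
def embed_in_batches(embed_batch, texts, batch_size=256):
    """Embed `texts` in fixed-size batches and return vectors in input order.

    `embed_batch` is any callable that takes a list of strings and returns
    one vector per string, in the same order (for example, a thin wrapper
    around client.embeddings.create from the Python example).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    vectors = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        batch_vectors = embed_batch(batch)
        # Guard against a mismatched response before results are merged.
        if len(batch_vectors) != len(batch):
            raise RuntimeError("expected one vector per input string")
        vectors.extend(batch_vectors)
    return vectors
```

Because each batch is a separate request, a failure partway through loses only that batch; callers that need retries can wrap embed_batch rather than changing the loop.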