Ollama

Ollama is the recommended default backend for Pendra. It supports full model install and uninstall, and the worker auto-discovers it on every supported OS.

What's supported

Capability    Status
Chat completions
Embeddings
Image generation
Audio transcription
Model install
Model uninstall

Connection

  • Default port: 11434
  • Auto-discovery probes: http://localhost:11434, then http://host.docker.internal:11434
  • Verification: the worker calls /api/version and /api/tags; both must return the expected Ollama-shaped JSON before the worker activates the Ollama backend.
  • Override: set OLLAMA_ENDPOINT in worker config.
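
The probe-and-verify flow above can be sketched roughly as follows. This is a minimal illustration, not the worker's actual code: the function names are hypothetical, the shape checks (a string "version" field and a "models" list) are an assumption about what "Ollama-shaped" means, and it assumes that setting OLLAMA_ENDPOINT replaces the default probes rather than adding to them.

```python
import json
import os
import urllib.request

# Probe order taken from the list above.
DEFAULT_PROBES = ["http://localhost:11434", "http://host.docker.internal:11434"]


def fetch_json(url, timeout=2.0):
    """GET a URL and decode its JSON body; raises on network or parse errors."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def looks_like_ollama(base_url, fetch=fetch_json):
    """Return True only if both /api/version and /api/tags look Ollama-shaped.

    The exact shape checks are an assumption for illustration.
    """
    try:
        version = fetch(base_url + "/api/version")
        tags = fetch(base_url + "/api/tags")
    except (OSError, ValueError):  # connection refused, timeout, bad JSON
        return False
    return (
        isinstance(version, dict)
        and isinstance(version.get("version"), str)
        and isinstance(tags, dict)
        and isinstance(tags.get("models"), list)
    )


def discover(env=os.environ, fetch=fetch_json):
    """Return the first endpoint that passes verification, or None."""
    override = env.get("OLLAMA_ENDPOINT")
    # Assumption: an explicit override is the only candidate that gets probed.
    candidates = [override.rstrip("/")] if override else DEFAULT_PROBES
    for base_url in candidates:
        if looks_like_ollama(base_url, fetch):
            return base_url
    return None
```

Passing `fetch` as a parameter keeps the verification logic testable without a running Ollama instance.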

Installing Ollama

Pendra does not install Ollama itself — install it from ollama.com/download on the same machine as the worker. Once Ollama is running and serving on port 11434, the worker discovers it on the next refresh tick.

Model discovery

The worker lists models from GET /api/tags and enriches each one with a parallel POST /api/show call (max 5 concurrent), capturing:

  • Parameter size and quantisation level
  • Format (GGUF), family
  • Context length
  • Capabilities array (e.g. embedding, vision)
  • Disk size
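
As a rough sketch of the enrichment step, the following merges each listed model with its per-model details while capping concurrency at 5. The `show` callable is a hypothetical stand-in for the /api/show request, and the response field names used here ("details", "capabilities", "model_info", and an architecture-prefixed "context_length" key) are assumptions about Ollama's response shape rather than a description of the worker's internals.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_SHOW = 5  # concurrency cap stated in the text above


def find_context_length(model_info):
    """Return the first '<arch>.context_length' value found, or None."""
    for key, value in model_info.items():
        if key.endswith(".context_length"):
            return value
    return None


def enrich_models(tags_response, show, max_workers=MAX_CONCURRENT_SHOW):
    """Merge each /api/tags entry with details from a per-model show call.

    `show` is a hypothetical callable: model name -> dict of show-style details.
    """
    models = tags_response.get("models", [])

    def enrich(entry):
        try:
            info = show(entry["name"])
        except Exception:
            info = {}  # keep the model listed even if enrichment fails
        return {
            "name": entry["name"],
            "disk_size": entry.get("size"),
            "details": info.get("details", {}),  # format, family, quantisation
            "capabilities": info.get("capabilities", []),
            "context_length": find_context_length(info.get("model_info", {})),
        }

    # The pool size bounds how many show calls run at once;
    # map() returns results in the same order as the input list.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(enrich, models))
```

A thread pool is used here only because it is the simplest way to bound concurrency in a short example; an async client with a semaphore would serve equally well.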

Context window

Ollama defaults to a 2048-token context window per request, regardless of what the underlying model supports. For models that report a much larger window via /api/show (e.g. qwen3.6:27b at 262144 tokens), this means longer conversations get silently truncated and the model loses task context mid-session.

The worker fixes this for you: when it dispatches a chat completion to Ollama, it injects options.num_ctx using the context length captured during model discovery. An explicit options.num_ctx sent by the client always wins. The injection only applies to chat completions — embedding, image, and transcription requests are forwarded unchanged.
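
The injection rule can be illustrated with a small pure function. This is a sketch under stated assumptions, not the worker's implementation: `inject_num_ctx` is a hypothetical name, and the request is modelled as a plain dict in the shape of an Ollama chat payload.

```python
import copy


def inject_num_ctx(request, context_length):
    """Return a chat request with options.num_ctx filled in when absent.

    `context_length` is the value captured during model discovery, or None
    if discovery found nothing. The caller's request dict is never mutated.
    """
    if context_length is None:
        return request  # nothing to inject
    patched = copy.deepcopy(request)
    options = patched.get("options") or {}
    # An explicit client-supplied num_ctx always wins.
    options.setdefault("num_ctx", context_length)
    patched["options"] = options
    return patched
```

For example, a request without options gains `{"num_ctx": 262144}` for a model discovered with that context length, while a request that already carries `options.num_ctx = 4096` comes back with 4096 intact.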

Curated installs

Any catalogue model whose backend is ollama (or both) can be installed from the console:

  1. Open console.pendra.ai and go to Models.
  2. Pick a catalogued model + variant (size / quantisation).
  3. Choose the destination worker. Progress streams live.

Uninstall works the same way — click Remove on an installed model and Pendra calls DELETE /api/delete on the worker's Ollama.
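
For readers who want to see what such a removal call might look like on the wire, here is a minimal sketch that builds (but does not send) the request. The helper name is hypothetical, and the JSON body shape `{"model": ...}` is an assumption based on Ollama's public API rather than something this page specifies.

```python
import json
import urllib.request


def build_delete_request(base_url, model_name):
    """Build a DELETE /api/delete request for one installed model.

    Assumption: Ollama expects the model identifier in a JSON body.
    """
    body = json.dumps({"model": model_name}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/delete",
        data=body,
        headers={"Content-Type": "application/json"},
        method="DELETE",
    )
```

Sending it would be a matter of passing the result to `urllib.request.urlopen`; in normal use the console's Remove button does this for you.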
