Backends
Backend capabilities
The Pendra worker ships with a built-in inference backend — just install the worker and you can start serving chat completions from the curated catalogue. You can optionally connect external backends (Ollama, vLLM, LM Studio, Speaches) on the same machine when you need capabilities the built-in backend doesn't cover, such as image generation or audio transcription.
Capability matrix
The most important columns are Model install and Model uninstall: these describe whether Pendra can install a model from our curated catalogue into that backend with one click from the console, or remove it the same way. For external backends, Pendra never installs the backend itself — once you have it running, model lifecycle becomes Pendra's job where the backend supports it.
| Backend | Chat | Embed | Image | Transcribe | Model install | Model uninstall |
|---|---|---|---|---|---|---|
| Pendra (built-in) | ✓ | ✓ | — | — | ✓ | ✓ |
| Ollama | ✓ | ✓ | ✓ | — | ✓ | ✓ |
| LM Studio | ✓ | ✓ | — | — | ✓ | via LM Studio app |
| Speaches | — | — | — | ✓ | ✓ | ✓ |
| vLLM | ✓ | ✓ | — | — | — | — |
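For scripting against the matrix, the table can be mirrored as a small lookup. This is only an illustrative sketch: the `CAPABILITIES` dictionary and the `backends_with` helper are hypothetical names, not part of Pendra, and the data is copied by hand from the table above, with LM Studio's "via LM Studio app" uninstall treated as not supported by Pendra.

```python
# Hand-copied from the capability matrix above; not read from any Pendra API.
CAPABILITIES = {
    "Pendra (built-in)": {"chat", "embed", "install", "uninstall"},
    "Ollama": {"chat", "embed", "image", "install", "uninstall"},
    # Uninstall happens inside the LM Studio app, so it is omitted here.
    "LM Studio": {"chat", "embed", "install"},
    "Speaches": {"transcribe", "install", "uninstall"},
    "vLLM": {"chat", "embed"},
}


def backends_with(capability: str) -> list[str]:
    """Return the backends whose row has a check mark for the capability."""
    return [name for name, caps in CAPABILITIES.items() if capability in caps]


print(backends_with("transcribe"))  # ['Speaches']
print(backends_with("image"))       # ['Ollama']
```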
What "model install" means
Pendra ships with a curated catalogue of vetted open-source models (Llama, Qwen, Mistral, gpt-oss, Phi, Nomic embeddings, Whisper variants, image models, etc.). For backends in the table that support model install, you can:
- Browse the catalogue at console.pendra.ai → Models.
- Click Install on any catalogue model.
- Pick a destination worker (and, for cross-backend models, a target backend — Pendra, Ollama, or LM Studio).
- Watch install progress stream live in the console.
The same flow drives uninstall where the backend supports it (Pendra, Ollama, and Speaches today). LM Studio models must currently be removed inside the LM Studio app; this is a limitation of LM Studio's API, not of Pendra.
Auto-discovery (external backends)
When no explicit endpoint is configured for an external backend, the worker probes http://localhost:<port> first, then http://host.docker.internal:<port>. The first endpoint that responds and passes a backend-specific Verify() wins. The Docker fallback means a worker running in a container can reach a backend on the host without extra config. The Pendra backend is in-process and doesn't need discovery.
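The probe order can be sketched roughly as follows. This is not the worker's actual code: `discover_endpoint` and `CANDIDATE_HOSTS` are hypothetical names, and the `verify` callable stands in for the backend-specific Verify() check, whose real behaviour is not described here. Treating a connection error as "try the next candidate" is also an assumption.

```python
from typing import Callable, Optional

# Candidate hosts in the order described above: localhost first,
# then the Docker host alias.
CANDIDATE_HOSTS = ("localhost", "host.docker.internal")


def discover_endpoint(port: int, verify: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate URL that verify() accepts, else None."""
    for host in CANDIDATE_HOSTS:
        url = f"http://{host}:{port}"
        try:
            if verify(url):
                return url
        except OSError:
            # Assumed behaviour: connection refused, DNS failure, or timeout
            # means this candidate is skipped and the next one is tried.
            continue
    return None
```

With a stand-in check that only accepts the Docker alias, `discover_endpoint(11434, lambda url: "host.docker.internal" in url)` returns `"http://host.docker.internal:11434"`, which is the container-to-host case the fallback exists for. Passing the check in as a parameter keeps the probe order separate from what each backend considers a valid response.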
Disabling a backend
Set its endpoint env var (or config key) to off, 0, false, none, or no. An empty string is not "disabled"; it is the auto-discover signal. See Configuration for details.
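The three possible meanings of an endpoint value can be illustrated with a small classifier. This is a sketch under assumptions: `classify_endpoint` and `DISABLE_VALUES` are hypothetical names, and trimming whitespace and ignoring case are guesses about how the worker parses the value, not documented behaviour.

```python
# The disable keywords listed above.
DISABLE_VALUES = {"off", "0", "false", "none", "no"}


def classify_endpoint(value: str) -> str:
    """Classify an endpoint setting as 'auto', 'disabled', or 'explicit'."""
    v = value.strip()  # assumption: surrounding whitespace is ignored
    if v == "":
        # Empty means "not configured", which triggers auto-discovery.
        return "auto"
    if v.lower() in DISABLE_VALUES:  # assumption: case-insensitive match
        return "disabled"
    # Anything else is taken to be an explicit endpoint URL.
    return "explicit"


print(classify_endpoint(""))                        # auto
print(classify_endpoint("off"))                     # disabled
print(classify_endpoint("http://localhost:11434"))  # explicit
```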