Pricing

Start building. Scale when you're ready.

Every plan includes zero data retention, UK jurisdiction, and OpenAI SDK compatibility. Bring your own GPUs with a free account, or let us manage everything.

Free

£0 /month

Get started with sovereign inference on your own hardware.

  • 1 self-hosted Pendra Worker
  • OpenAI-compatible API endpoint
  • Zero data retention
  • UK legal jurisdiction
  • Python, Node.js, Go, .NET, and Rust SDKs
  • Community documentation
Get Started

Pro

£99 /month

For teams running real workloads across multiple machines.

  • Everything in Free, plus:
  • Up to 5 self-hosted Pendra Workers
  • Enhanced usage analytics
  • Worker monitoring & webhook alerts
  • Standard DPA included
  • Priority support (email)
  • Request logging (optional)
Get Started

Enterprise

Custom

Managed GPUs, advanced security, and dedicated support.

  • Everything in Pro, plus:
  • Pendra-managed GPU infrastructure
  • Unlimited self-hosted Workers
  • E2E encrypted worker communications
  • Automatic data masking and redaction
  • Granular audit logging
  • Custom DPA and DPIA support
  • Dedicated account manager
  • SLA with uptime guarantee
  • SSO and role-based access control
  • Onboarding and integration support
Talk to Us

Plan details

Compare plans

Free
Pro
Enterprise

Infrastructure

Self-hosted Workers Run inference on your own GPU hardware via the Pendra orchestration layer.
1
Up to 5
Unlimited
OpenAI SDK compatibility Drop-in replacement — swap your base URL and use existing OpenAI client code.
SDKs Native client libraries for Python, Node.js, Go, .NET, and Rust.
Pendra-managed GPUs Dedicated GPU clusters in UK data centres, fully managed by Pendra.

Operations

Enhanced usage analytics Request volume, latency percentiles, model usage breakdowns, and per-key tracking.
Worker monitoring & webhook alerts Health dashboard with notifications for worker offline events and error rate spikes.
Optional request logging Opt-in logging of requests and responses. Off by default — you choose what to capture.
Priority support Faster response times from the Pendra engineering team.
Email
Dedicated

Security & Compliance

Zero data retention Prompts and completions processed in RAM and never written to disc. Architectural, not policy.
UK jurisdiction All infrastructure on UK soil, operated by a UK entity. Outside the US CLOUD Act.
DPA Data Processing Agreement for UK GDPR compliance. Standard template or custom-negotiated.
Standard
Custom
E2E encrypted worker comms End-to-end encryption between the Pendra control plane and your workers.
Auto data masking Automatic redaction of PII and sensitive data before it reaches the model.
Audit logging Granular system-level logs of all API activity and configuration changes.
DPIA support Pendra provides input and documentation for your Data Protection Impact Assessments.
SSO & RBAC Single sign-on integration and role-based access control for your organisation.
SLA Contractual uptime guarantee with defined response and resolution times.

FAQ

Frequently asked questions

Is my data stored or logged?
No. On every plan, prompts and completions are processed in RAM and never stored. This is architectural, not a policy setting.
What's a Pendra Worker?
A Worker is the Pendra orchestration layer running on your own GPU hardware. You install it, connect it to the Pendra control plane over a secure websocket, and serve models through the same API as managed infrastructure. Requests route through Pendra's UK servers to your hardware — processed in RAM and never stored.
What models can I run?
Pendra serves open-weight models — Llama, Mistral, Qwen, DeepSeek, and others. We don't offer proprietary models like GPT, Claude or Gemini. Open-weight on sovereign infrastructure means full transparency and no vendor lock-in at the model layer.
How is Enterprise pricing structured?
Enterprise pricing is based on your GPU requirements, model selection, and throughput needs. Managed GPU plans are flat-rate — dedicated compute at a fixed monthly price, no per-token billing. Get in touch and we'll scope it with you.
Can I mix managed and self-hosted?
Yes. Enterprise customers can run Pendra-managed GPUs and self-hosted Workers through a single API and control plane. Route different models or workloads to different backends.
What compliance certifications does Pendra hold?
Pendra is UK GDPR compliant. Cyber Essentials and ISO 27001 certifications are in progress. We provide DPAs and DPIA support as standard on Enterprise plans.
Can I upgrade or downgrade at any time?
Yes. Move between Free and Pro at any time. Enterprise transitions are handled with your account manager.

Not sure which plan fits?

Book a 15-minute call. We'll help you figure out the right setup for your workload and compliance requirements.

Talk to Us