Security Whitepaper

Sovereign AI infrastructure
for regulated industries

This document outlines the security architecture, data handling practices, and compliance posture of Pendra — the managed AI infrastructure powering platforms that ship into healthcare, legal, public sector and financial services.

Last updated: May 2026 | Version 1.1

1. Executive Summary

Pendra provides managed AI inference — chat, embeddings, image generation, and audio — to platforms that build for organisations handling sensitive, regulated, or classified data. The infrastructure exists specifically to eliminate the legal and technical risks of routing that data through foreign-owned cloud platforms.

Every component of the Pendra stack (compute, networking, storage, and operations) resides within the United Kingdom, is owned by a Welsh-incorporated entity, and is operated exclusively by vetted UK personnel. Customer data is subject only to UK law, with no exposure to foreign intelligence legislation such as the US CLOUD Act, FISA Section 702, or equivalent frameworks.

Pendra is built for the platforms behind NHS trusts, government bodies, law firms, and financial services firms: anywhere "probably compliant" is not an acceptable posture.

2. Data Sovereignty & Residency

Data sovereignty is the founding principle of Pendra. All customer data — prompts, completions, embeddings, image inputs, audio inputs, and metadata — is processed and transmitted exclusively within the United Kingdom.

Sovereignty Guarantees

  • 01 All physical compute hardware is located in UK data centres
  • 02 Pendra AI Ltd is incorporated in Wales
  • 03 No data is transmitted to, processed in, or accessible from any non-UK jurisdiction
  • 04 No US-headquartered company has ownership, operational control, or legal access to any Pendra system
  • 05 All disputes and legal proceedings are governed by the laws of England and Wales

This structure ensures that the US CLOUD Act, FISA Section 702, Executive Order 12333, and equivalent foreign intelligence frameworks have no legal mechanism to compel access to data processed by Pendra.

3. Deployment Models

Pendra supports two deployment topologies. Both share the same orchestration layer, API surface, and zero-retention guarantees, but they differ in where inference physically executes.

Pendra-managed workers

We run dedicated GPU workers in UK data centres. Inference executes on our hardware; no infrastructure work falls to you. Best for platforms that want fully managed compliance.

Self-hosted workers

Run Pendra Workers on your own GPUs, inside your own network. Inference data never reaches Pendra-managed infrastructure — only orchestration metadata transits our API. Strongest data-isolation posture available.

Self-hosted workers connect outbound to the Pendra API over WebSocket, so no inbound ports need to be exposed from your network. Workers authenticate with rotating Ed25519-signed JWTs (30-second TTL, 60-second maximum clock skew) that are verified against a per-organisation public-key registry.
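As an illustration only, the time-window part of that check could look like the sketch below. It assumes the Ed25519 signature has already been verified, and it assumes the token carries standard iat and exp claims; the WorkerClaims type and the claims_time_valid function are hypothetical names, not part of the Pendra API.

```python
from dataclasses import dataclass

TOKEN_TTL_SECONDS = 30       # TTL stated in this document
MAX_CLOCK_SKEW_SECONDS = 60  # maximum clock skew stated in this document


@dataclass(frozen=True)
class WorkerClaims:
    """Hypothetical subset of the claims a worker JWT might carry."""
    org_id: str  # organisation the signing key is registered to
    iat: int     # issued-at, seconds since the Unix epoch
    exp: int     # expiry, seconds since the Unix epoch


def claims_time_valid(claims: WorkerClaims, now: int) -> bool:
    """Return True if the token's time claims are acceptable at `now`.

    Signature verification is assumed to have succeeded before this call.
    """
    lifetime = claims.exp - claims.iat
    # Reject tokens with a non-positive lifetime or one longer than the TTL.
    if lifetime <= 0 or lifetime > TOKEN_TTL_SECONDS:
        return False
    # Reject tokens issued too far in the future (worker clock running ahead).
    if claims.iat > now + MAX_CLOCK_SKEW_SECONDS:
        return False
    # Reject tokens that expired longer ago than the skew allowance.
    if claims.exp + MAX_CLOCK_SKEW_SECONDS < now:
        return False
    return True
```

Because the tokens are so short-lived, a worker under this model would mint a fresh token for each connection attempt rather than caching one.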

4. Infrastructure Security

Pendra operates dedicated GPU workers within Tier 3+ UK data centres. Workers are isolated at the process and container level, with no shared inference state between organisations.

Physical Security

  • 24/7 on-site security with biometric access controls
  • CCTV monitoring with 90-day retention
  • Mantrap entry systems with dual authentication
  • Environmental controls: redundant power (N+1), fire suppression, flood detection

Network Security

  • All API traffic encrypted with TLS 1.3 (minimum)
  • Network segmentation between inference workers, management plane, and public endpoints
  • DDoS mitigation at the network edge
  • No inbound SSH. Management access via hardened bastion with MFA and audit logging

Workload Isolation

  • Each worker runs inference in a dedicated process with no shared state across organisations
  • Worker registry uses atomic, Redis-backed dispatch to prevent cross-tenant request leakage
  • Prompts and completions are held only in process memory for the duration of a single request
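The dispatch idea can be sketched in a few lines. The example below is a simplified, in-memory stand-in for the Redis-backed registry: a lock plays the role of the atomic operation, and each worker is assumed to be registered to exactly one organisation. The Worker and WorkerRegistry names, and their fields, are hypothetical and chosen for illustration.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Worker:
    worker_id: str
    org_id: str                                # organisation the worker serves (assumption)
    models: set = field(default_factory=set)   # model IDs loaded on this worker
    in_flight: int = 0                         # requests currently being processed


class WorkerRegistry:
    """In-memory stand-in for an atomic, Redis-backed worker registry."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._workers = {}

    def register(self, worker: Worker) -> None:
        with self._lock:
            self._workers[worker.worker_id] = worker

    def acquire(self, org_id: str, model: str) -> Optional[Worker]:
        """Pick the least-busy eligible worker and reserve a slot in one step."""
        with self._lock:
            eligible = [
                w for w in self._workers.values()
                if w.org_id == org_id and model in w.models
            ]
            if not eligible:
                return None  # never fall back to another organisation's worker
            chosen = min(eligible, key=lambda w: w.in_flight)
            chosen.in_flight += 1
            return chosen

    def release(self, worker: Worker) -> None:
        with self._lock:
            worker.in_flight = max(0, worker.in_flight - 1)
```

Selecting and reserving under a single lock is what prevents two concurrent requests from racing onto the same slot; in a Redis-backed design the same property would typically come from a single atomic command or script.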

5. Data Handling & Zero-Retention

Pendra operates a strict zero-retention policy for inference data. We do not store, log, cache, or inspect the content of prompts, completions, embeddings, image inputs, or audio inputs.

Data Lifecycle

RECEIVE: Encrypted API request arrives at a UK endpoint over TLS 1.3
DISPATCH: Request is routed in-memory to the least-busy worker holding the requested model
PROCESS: Worker performs inference on the prompt; intermediate state lives only in process memory
RESPOND: Completion is returned to the caller over the same TLS-protected connection
DISCARD: In-memory prompt, completion, and intermediate tensors are released. No artefact is persisted to disk.

We log only operational metadata: timestamp, model ID, token counts, latency, request type, and HTTP status. This metadata contains no prompt or completion content and is used solely for billing, capacity planning, and system health monitoring.
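One way to enforce a metadata-only log is an allowlist applied at the point of logging, so that content fields cannot be written even by mistake. The sketch below assumes a request event is represented as a dictionary; the field names and the to_log_record helper are illustrative, not Pendra's actual schema.

```python
# Field names are illustrative; the list mirrors the metadata named above.
LOGGED_FIELDS = (
    "timestamp",
    "model_id",
    "token_counts",
    "latency_ms",
    "request_type",
    "http_status",
)


def to_log_record(event: dict) -> dict:
    """Keep only allowlisted operational fields; drop everything else."""
    return {name: event[name] for name in LOGGED_FIELDS if name in event}
```

An allowlist fails safe: a new field added to the event is excluded from logs until someone deliberately adds it to the list.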

We never use customer data for model training, fine-tuning, evaluation, or any purpose beyond fulfilling the immediate inference request.

6. Access Control & Authentication

All API access is authenticated using API keys with the prefix pdr_sk_. Keys are hashed with SHA-256 before storage. Pendra never stores plaintext API keys.

Key Management

  • API keys are generated with cryptographically secure random bytes
  • Keys can be issued with optional expiration dates
  • Key usage is tracked (last-used timestamp, request counts)
  • Keys can be revoked instantly via the dashboard
  • The full key is shown only once at creation time
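A minimal sketch of this key lifecycle, using only the Python standard library, is shown below. The pdr_sk_ prefix and SHA-256 hashing come from this document; the amount of randomness (32 bytes), the URL-safe encoding, and the function names are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

KEY_PREFIX = "pdr_sk_"  # prefix stated in this document


def hash_api_key(plaintext: str) -> str:
    """Return the SHA-256 hex digest that would be stored instead of the key."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


def generate_api_key() -> tuple:
    """Return (plaintext_key, stored_digest). The plaintext is shown once."""
    # 32 random bytes from the OS CSPRNG; the length is an assumption.
    plaintext = KEY_PREFIX + secrets.token_urlsafe(32)
    return plaintext, hash_api_key(plaintext)


def verify_api_key(presented: str, stored_digest: str) -> bool:
    """Check a presented key against the stored digest in constant time."""
    if not presented.startswith(KEY_PREFIX):
        return False
    return hmac.compare_digest(hash_api_key(presented), stored_digest)
```

A fast, unsalted hash is a reasonable fit here because the keys are long random strings rather than human-chosen passwords, so there is no practical dictionary to attack.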

Dashboard Authentication

  • Dashboard access requires email-based authentication with HS256-signed JWTs
  • Session tokens are short-lived and non-persistent
  • Optional Google OAuth for organisations that prefer SSO
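For readers unfamiliar with HS256, the sketch below signs and verifies a compact JWT with HMAC-SHA-256 using only the standard library. It is not Pendra's implementation: the function names, the claim set, and the secret handling are assumptions, and a production service would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_session_token(claims: dict, secret: bytes) -> str:
    """Build a compact JWT signed with HMAC-SHA-256 (HS256)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_session_token(token: str, secret: bytes, now: Optional[int] = None) -> Optional[dict]:
    """Return the claims if signature and expiry check out, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError:
        return None  # malformed token
    if not isinstance(header, dict) or not isinstance(claims, dict):
        return None
    # Pin the algorithm rather than trusting whatever the header declares.
    if header.get("alg") != "HS256":
        return None
    signing_input = (header_b64 + "." + payload_b64).encode("ascii")
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    current = int(time.time()) if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, int) or exp <= current:
        return None  # missing or elapsed expiry
    return claims
```

Requiring an exp claim on every token is what keeps sessions short-lived: a token without an expiry is rejected rather than treated as permanent.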

Worker Keys

GPU workers authenticate to the Pendra API using Ed25519 keypairs registered against a single organisation. Only the public half is ever transmitted to or stored by Pendra; the private key never leaves the worker host.

  • Keypairs generated locally; private half never leaves the worker
  • Public keys scoped to a single organisation
  • Workers sign short-lived JWTs — 30s TTL, 60s clock skew
  • Instant revocation via the dashboard
  • Connect, disconnect, and health events audited per worker

Rate Limiting

Per-key rate limiting prevents abuse and ensures fair resource allocation across customers. Rate limit responses use standard HTTP 429 status codes with Retry-After headers.
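The behaviour described above can be sketched with a simple fixed-window counter per key. The algorithm, limit, and window length are assumptions chosen for illustration (this document does not state which algorithm Pendra uses); only the 429 status and the Retry-After header come from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class _Window:
    started_at: float  # time the current window opened, in seconds
    count: int         # requests admitted in the current window


class FixedWindowLimiter:
    """Hypothetical per-key limiter: at most `limit` requests per window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._windows = {}

    def check(self, key_id: str, now: float) -> tuple:
        """Return (status, headers): 200 if admitted, else 429 with Retry-After."""
        window = self._windows.get(key_id)
        if window is None or now - window.started_at >= self.window_seconds:
            # Start a fresh window for this key.
            window = _Window(started_at=now, count=0)
            self._windows[key_id] = window
        if window.count < self.limit:
            window.count += 1
            return 200, {}
        # Tell the client how many whole seconds remain until the window resets.
        retry_after = math.ceil(window.started_at + self.window_seconds - now)
        return 429, {"Retry-After": str(max(1, retry_after))}
```

Clients can then treat a 429 as a signal to pause for the number of seconds given in Retry-After before retrying, instead of retrying immediately.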

7. Encryption

In Transit

  • TLS 1.3 enforced on all API endpoints. TLS 1.2 and below are rejected
  • HSTS headers with long max-age to prevent protocol downgrade attacks
  • Certificate transparency logging enabled
  • Worker-to-API WebSocket connections are wrapped in TLS and gated by Ed25519 JWT
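As a rough illustration of the first two points, the sketch below builds a server-side TLS context that refuses anything older than TLS 1.3 and defines an HSTS header. The max-age value (two years) and the includeSubDomains directive are assumptions, since this document only specifies a "long" max-age; certificate loading is omitted.

```python
import ssl

# Illustrative HSTS policy; the exact max-age is an assumption.
HSTS_HEADER = ("Strict-Transport-Security", "max-age=63072000; includeSubDomains")


def make_tls13_server_context() -> ssl.SSLContext:
    """Return a server context that rejects TLS 1.2 and below."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    # A real deployment would call context.load_cert_chain(...) here.
    return context
```

With the minimum version pinned, a client that offers only TLS 1.2 fails the handshake instead of negotiating a weaker protocol.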

At Rest

  • Inference data is not stored at rest (zero-retention policy)
  • Operational metadata and account data encrypted with AES-256
  • API key hashes stored using SHA-256 one-way hashing
  • Database backups encrypted and stored within UK jurisdiction

8. Compliance & Certifications

UK GDPR

Pendra is fully compliant with the UK General Data Protection Regulation (UK GDPR). Pendra acts as a data processor under its standard DPAs and processes data exclusively within UK jurisdiction.

Cyber Essentials Plus (In Progress)

Certification in progress under the UK Government-backed Cyber Essentials Plus scheme — an audited extension of Cyber Essentials that includes hands-on vulnerability testing of internet-facing systems and end-user devices.

ISO 27001 (In Progress)

Information Security Management System (ISMS) aligned with ISO 27001 requirements. Certification in progress.

Pro customers are covered by our standard Data Processing Agreement. Enterprise customers receive a custom DPA and Data Protection Impact Assessment (DPIA) support tailored to their deployment.

9. Incident Response

Pendra maintains a documented Incident Response Plan (IRP) with defined severity levels, escalation paths, and communication procedures.

  • Detection: Automated monitoring and alerting across all infrastructure components
  • Classification: Incidents triaged by severity (P1–P4) within 15 minutes of detection
  • Notification: Affected customers notified within 72 hours of a confirmed data-related incident on a best-endeavours basis, in line with our standard DPA and UK GDPR Article 33
  • Remediation: Root cause analysis and remediation documented for every incident
  • Review: Post-incident reviews conducted within 5 business days with published findings
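The time targets above reduce to simple date arithmetic, sketched below. Two assumptions are made that this document does not state: the five-business-day review clock is taken to start when the incident is resolved, and business days are Monday to Friday with public holidays ignored. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by `days` weekdays (Mon-Fri); public holidays are ignored."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday is 0, Friday is 4
            remaining -= 1
    return current


def incident_deadlines(detected_at: datetime,
                       confirmed_at: datetime,
                       resolved_at: datetime) -> dict:
    """Compute the target times implied by the response steps above."""
    return {
        "classify_by": detected_at + timedelta(minutes=15),
        "notify_customers_by": confirmed_at + timedelta(hours=72),
        "review_by": add_business_days(resolved_at, 5),  # start point is an assumption
    }
```

Using timezone-aware timestamps throughout avoids ambiguity when an incident spans a clock change.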

10. Personnel Security

All Pendra engineers with logical access to production infrastructure are:

  • UK nationals or residents with the right to work in the United Kingdom
  • Subject to background checks aligned with BS 7858 principles
  • Bound by confidentiality agreements and acceptable use policies
  • Required to use hardware security keys for all production access
  • Subject to quarterly access reviews and principle-of-least-privilege enforcement

No third party has logical access to customer inference data. Infrastructure providers operating physical hosts and managed storage have no path to plaintext customer data, which is only ever held in worker process memory for the lifetime of a single request.

11. Contact

For security inquiries, vulnerability reports, or to request a copy of our Data Processing Agreement, please contact:

Pendra AI Ltd / Security Team

contact@pendra.ai

We acknowledge all security reports within 24 hours and aim to provide an initial assessment within 72 hours.