Google DeepMind’s efficient multimodal family
Gemma 4
by Google DeepMind
Capabilities: Chat, Vision, Tools, Thinking
Gemma 4 pairs vision, reasoning and tool use with compact sizes (including edge-friendly E2B/E4B variants) and a 256K context window.
- Publisher: Google DeepMind
- Context window: 256K tokens
- Sizes: 12B, 26B, 31B, E2B, E4B
- Licence: Apache 2.0
Run Gemma 4
Install it on a Pendra worker, then call it through the OpenAI-compatible API with a pdr_sk_ key.
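Because the API is described as OpenAI-compatible, the request can also be pictured without the SDK. The sketch below only builds the HTTP request and does not send it. It assumes the usual OpenAI conventions (a `/chat/completions` path and a `Bearer` authorization header); `BASE_URL` is a placeholder, not a real Pendra endpoint, and `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request

# Assumption: placeholder base URL. Substitute your deployment's real endpoint.
BASE_URL = "https://pendra.example/v1"


def build_chat_request(
    api_key: str, prompt: str, model: str = "gemma4:12b"
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",  # assumed path, per OpenAI convention
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("pdr_sk_...", "Hello!")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it once the placeholder URL and key are replaced with real values.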
Chat
from pendra import Pendra

client = Pendra(api_key="pdr_sk_...")
response = client.chat.completions.create(
    model="gemma4:12b",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

Vision
from pendra import Pendra

client = Pendra(api_key="pdr_sk_...")
response = client.chat.completions.create(
    model="gemma4:12b",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)

Run Gemma 4 on your own UK infrastructure
Deploy a worker, install Gemma 4, and start serving it through one sovereign API endpoint.
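Tool use is listed among Gemma 4's capabilities but is not illustrated with the chat and vision calls. The sketch below shows the client-side half of a tool-calling loop in the OpenAI function-calling format: a tool schema, and a dispatcher that turns the model's requested calls into `tool` role messages. It works on plain dictionaries shaped like the JSON wire format. `get_weather`, `REGISTRY`, and `run_tool_calls` are hypothetical names, and whether the Pendra client accepts a `tools` argument is an assumption based on its stated OpenAI compatibility.

```python
import json


# Hypothetical tool: the name, parameter and return value are illustrative only.
def get_weather(city: str) -> dict:
    return {"city": city, "forecast": "unknown"}


# Tool schema in the OpenAI function-calling format.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the weather forecast for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

# Maps tool names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls: list) -> list:
    """Execute each requested call and return `tool` role messages."""
    results = []
    for call in tool_calls:
        fn = REGISTRY[call["function"]["name"]]
        # Arguments arrive as a JSON-encoded string in this format.
        args = json.loads(call["function"]["arguments"])
        results.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(fn(**args)),
            }
        )
    return results
```

In a full loop, `TOOLS` would be passed as `tools=TOOLS` to the chat completion call (assuming it is accepted), and the messages returned by `run_tool_calls` would be appended to the conversation before asking the model for its final answer.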