Gemma 4 E4B

Name: Gemma 4 E4B
Author: Google

License: Apache 2.0

Google · 4B · transformer-decoder

Released 2026-04-02 · 131K context · 4B params

Use Cases

chat · code · reasoning · multilingual · vision

Quantization Options

Quant	Bits	VRAM	Quality	Status
Q4_K_M (recommended)	4	6.0 GB	Good	—
Q8_0	8	10.0 GB	Excellent	—
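
As a rough illustration of how the table above can be used, the following sketch picks the highest-precision quantization whose listed VRAM figure fits a given budget. The VRAM numbers are copied from the table; the `Quant` type, the `pick_quant` function, and the `headroom_gb` parameter are hypothetical names introduced here for illustration, and real memory use will also depend on context length and runtime overhead.

```python
from typing import NamedTuple, Optional


class Quant(NamedTuple):
    name: str
    bits: int
    vram_gb: float
    quality: str


# Values copied from the quantization table on this page.
QUANTS = [
    Quant("Q4_K_M", 4, 6.0, "Good"),
    Quant("Q8_0", 8, 10.0, "Excellent"),
]


def pick_quant(available_vram_gb: float, headroom_gb: float = 0.0) -> Optional[Quant]:
    """Return the highest-bit quant whose listed VRAM fits the budget, or None.

    headroom_gb is a hypothetical safety margin reserved for the OS,
    other applications, or a longer context than the table assumes.
    """
    budget = available_vram_gb - headroom_gb
    fitting = [q for q in QUANTS if q.vram_gb <= budget]
    return max(fitting, key=lambda q: q.bits) if fitting else None


if __name__ == "__main__":
    for vram in (4, 8, 12):
        choice = pick_quant(vram)
        print(f"{vram} GB -> {choice.name if choice else 'no listed quant fits'}")
```

Setting `headroom_gb` to a positive value is a conservative choice on machines where the GPU memory is shared with the rest of the system.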

About this model

Gemma 4 E4B is Google's efficient multimodal model with 4.5B effective parameters, aimed at laptops and consumer devices. It delivers a significant quality leap over its predecessor, Gemma 3n E4B, while maintaining a similar footprint. It supports text and vision inputs with a 128K-token context window (131,072 tokens, listed as 131K elsewhere on this page). At Q4 it needs about 6 GB of VRAM, which fits on entry-level GPUs and 8 GB Macs.

Install

Ollama

ollama run gemma4:e4b-it-q4_K_M
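
Once the model has been pulled, it can also be queried programmatically. The sketch below assumes an Ollama server running locally on its default port (11434) and uses its `/api/generate` endpoint with streaming disabled; the model tag is the one from the command above, while `build_request` and `generate` are hypothetical helper names introduced for illustration.

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint; adjust if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "gemma4:e4b-it-q4_K_M"  # tag taken from the install command on this page


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    print(generate("Explain in one sentence what quantization does to a model."))
```

With `"stream": False` the server returns a single JSON object, which keeps the example short; a streaming client would instead read one JSON object per line.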

llama.cpp / GGUF

Download the GGUF file from Hugging Face.

Specs

Parameters: 4B (4.5B effective)
Architecture: transformer-decoder
Context: 131K tokens
Min VRAM: 6.0 GB
Recommended: 6.0 GB
Family: Gemma 4
Released: 2026-04-02
License: Apache 2.0