Gemma 4 E2B
by Google · gemma-4 family · 2B parameters

Tags: text-generation · reasoning · multilingual · vision
Gemma 4 E2B is Google's smallest Gemma 4 model with 2.3B effective parameters, designed for edge and mobile deployment. Despite its compact size, it supports vision input and 140+ languages with a 128K context window. At Q4 it needs just 4 GB of VRAM, making it one of the most capable models you can run on virtually any hardware — including phones and Raspberry Pi devices.
Quick Start with Ollama
```shell
ollama run e2b-it-q4_K_M
```

| Spec | Value |
|---|---|
| Creator | Google |
| Parameters | 2B |
| Architecture | transformer-decoder |
| Context | 128K tokens |
| Released | Apr 2, 2026 |
| License | Apache 2.0 |
| Ollama | gemma4:e2b |
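Beyond the CLI, Ollama exposes a local REST API (default port 11434) that the same model tag can be queried through. A minimal Python sketch, assuming an Ollama server is already running and the model has been pulled; the exact tag may need adjusting to your local install:

```python
import json
import urllib.request

# Default endpoint of a locally running Ollama server (assumption: stock install).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    # Minimal non-streaming request body for Ollama's /api/generate endpoint.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    # POST the JSON payload and return the model's text from the "response" field.
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running server):
#   print(generate("gemma4:e2b", "Summarize yourself in one sentence."))
```

Setting `"stream": False` returns one JSON object instead of a stream of chunks, which keeps the client code short for quick experiments.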
Quantization Options
| Format | File Size | VRAM Required | Ollama Tag |
|---|---|---|---|
| Q4_K_M (recommended) | 7.2 GB | 4 GB | e2b-it-q4_K_M |
| Q8_0 | 8.1 GB | 6 GB | e2b-it-q8_0 |
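File sizes like the ones above follow roughly from parameter count times bits per weight. A back-of-the-envelope sketch; note that for this family the effective parameter count (2B) differs from the total weights stored on disk, so treat any estimate as indicative only:

```python
def quantized_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough on-disk size in GB: parameters x bits per weight, ignoring metadata."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Approximate effective bits per weight (values include per-block scale overhead;
# both figures are rough rules of thumb, not exact format constants):
#   Q8_0   ~ 8.5 bits/weight
#   Q4_K_M ~ 4.8 bits/weight
print(quantized_size_gb(2.0, 8.0))  # 2B params at a flat 8 bits -> 2.0 GB
```

Plugging the 2B effective count into this formula gives much smaller files than the table lists, which is consistent with the on-disk weights covering more than the effective parameters.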
Compatible Hardware
At Q4_K_M quantization, the model requires 4 GB of VRAM.