# Gemma 4 26B

by Google · gemma-4 family · 26B parameters

Tags: text-generation, code-generation, reasoning, multilingual, vision, tool-use, math
Gemma 4 26B is a Mixture-of-Experts model with 26B total parameters but only 3.8B active per token, giving it exceptional efficiency. It ranks #6 on Arena AI among open models and scores 88.3% on AIME 2026 — remarkable for its active parameter count. The MoE architecture means it fits in ~20 GB VRAM at Q4 while delivering reasoning quality that rivals much larger dense models. Supports 256K context with native vision and tool use.
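The efficiency claim rests on sparse expert routing: a gating network scores every expert for each token, but only the top-k experts actually run, so most parameters sit idle per token. A minimal sketch of top-k gating in plain Python (the 8-expert layer and k=2 below are illustrative values, not Gemma's actual configuration):

```python
import math

def top_k_gating(logits, k):
    """Select the k highest-scoring experts and renormalize their
    softmax weights so the selected weights sum to 1."""
    # Rank experts by gate logit and keep only the top k.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the selected logits only (max-shifted for stability).
    m = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - m) for i in top}
    z = sum(exps.values())
    return {i: exps[i] / z for i in top}

# Hypothetical 8-expert MoE layer routing one token to 2 experts:
# only those 2 expert FFNs are evaluated; the other 6 are skipped.
weights = top_k_gating([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3], k=2)
```

This is why active parameters (3.8B), not total parameters (26B), determine per-token compute, while total parameters still determine memory footprint.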
## Quick Start with Ollama

```
ollama run gemma4:26b-a4b-it-q4_K_M
```

| Spec | Value |
|---|---|
| Creator | Google |
| Parameters | 26B |
| Architecture | transformer-decoder |
| Context | 256K tokens |
| Released | Apr 2, 2026 |
| License | Apache 2.0 |
| Ollama | gemma4:26b |
## Quantization Options

| Format | File Size | VRAM Required | Ollama Tag |
|---|---|---|---|
| Q4_K_M (recommended) | 18 GB | 20 GB | 26b-a4b-it-q4_K_M |
| Q8_0 | 28 GB | 30 GB | 26b-a4b-it-q8_0 |
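The file sizes above follow roughly from parameter count times average bits per weight. A back-of-the-envelope estimator (the bits-per-weight figures are typical llama.cpp averages, not exact, and real GGUF files run somewhat larger once embeddings and metadata are included):

```python
# Approximate average bits per weight for common llama.cpp quant
# formats (assumed typical values; actual files vary by tensor mix).
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5}

def est_file_gb(total_params: float, fmt: str) -> float:
    """Rough GGUF file size in GB: params x bits/weight / 8 bits per byte."""
    return total_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9

for fmt in BITS_PER_WEIGHT:
    print(f"{fmt}: ~{est_file_gb(26e9, fmt):.1f} GB")
```

For a 26B-parameter model this lands near the table's Q8_0 figure and a little under its Q4_K_M figure; plan VRAM at file size plus a few GB of headroom for the KV cache, which grows with context length.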
## Compatible Hardware

The recommended Q4_K_M quantization requires 20 GB of VRAM; Q8_0 requires 30 GB.
## Benchmark Scores

| Benchmark | Score |
|---|---|
| AIME 2026 | 88.3 |
| LiveCodeBench | 77.1 |