chat code reasoning multilingual vision tools math
Quantization Options
Quant                   Bits   VRAM      Quality     Status
Q4_K_M (recommended)    4      20.0 GB   Good        -
Q8_0                    8      30.0 GB   Excellent   -
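A minimal sketch of how the table above might be used to choose a quantization for a given GPU. The VRAM figures are copied from the table; the function name `pick_quant` and the rule "take the largest option that fits" are illustrative assumptions, not part of any official tooling.

```python
from typing import Optional

# VRAM requirements in GB, copied from the quantization table above.
QUANT_VRAM_GB = {"Q4_K_M": 20.0, "Q8_0": 30.0}


def pick_quant(available_vram_gb: float) -> Optional[str]:
    """Return the largest listed quant that fits in the given VRAM.

    Hypothetical helper: assumes a larger VRAM requirement means higher
    quality, which matches the two rows in the table. Returns None when
    no listed option fits.
    """
    fitting = {q: v for q, v in QUANT_VRAM_GB.items() if v <= available_vram_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for vram in (16, 24, 32):
        print(f"{vram} GB -> {pick_quant(vram)}")
```

With these table values, a 24 GB card would land on Q4_K_M and a 32 GB card on Q8_0. The listed figures leave no explicit headroom for long contexts, so treat the result as a starting point.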
About this model
Gemma 4 26B is a Mixture-of-Experts (MoE) model with 26B total parameters, of which only 3.8B are active per token, so each token costs roughly as much compute as a dense model of about 4B parameters. It ranks #6 on Arena AI among open models and scores 88.3% on AIME 2026, a strong result for its active parameter count.
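The model fits in ~20 GB of VRAM at Q4_K_M while delivering reasoning quality that rivals much larger dense models. The memory footprint is set by the 26B total parameters, since every expert must stay resident; the MoE design reduces per-token compute, not VRAM. It supports a 256K-token context with native vision and tool use.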
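As a rough back-of-envelope check on these numbers, the sketch below estimates the weight-only memory from the total parameter count and the nominal bit widths in the table, and computes the fraction of parameters active per token. The helper `weight_gb` is hypothetical. The estimate ignores the KV cache, activations, and runtime overhead, and real quantization formats store per-block scales, so their effective bits per weight are somewhat higher than the nominal value; that is presumably why the listed VRAM figures are larger than the raw weight sizes.

```python
# Parameter counts taken from the model description.
TOTAL_PARAMS = 26e9    # all experts; must be resident in memory
ACTIVE_PARAMS = 3.8e9  # parameters actually used per token


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Weight-only memory in GB (1 GB = 1e9 bytes), ignoring all overhead."""
    return params * bits_per_weight / 8 / 1e9


# Share of the model that participates in each token's forward pass.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

if __name__ == "__main__":
    for name, bits in (("Q4_K_M", 4), ("Q8_0", 8)):
        print(f"{name}: ~{weight_gb(TOTAL_PARAMS, bits):.1f} GB of weights (nominal bits)")
    print(f"Active per token: {active_fraction:.1%} of parameters")
```

At nominal widths this gives about 13 GB of weights at 4 bits and 26 GB at 8 bits, with under 15% of parameters active per token.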