# Gemma 4 31B

by Google · gemma-4 family · 31B parameters

**Tags:** text-generation · code-generation · reasoning · multilingual · vision · tool-use · math · creative-writing · summarization
Gemma 4 31B is the flagship of the Gemma 4 family — a 30.7B-parameter dense model that ranks #3 on Arena AI among all open models, outcompeting models with 20x more parameters. It scores 84.3% on GPQA Diamond, 89.2% on AIME 2026, and 80.0% on LiveCodeBench. At Q4 it needs about 22 GB of VRAM, fitting on an RTX 3090/4090/5090 or a Mac with 24 GB+ unified memory. Released under Apache 2.0, it is one of the most permissively licensed frontier-class open models available.
## Quick Start with Ollama

```shell
ollama run gemma4:31b-it-q4_K_M
```

## Model Details

| | |
|---|---|
| Creator | Google |
| Parameters | 31B |
| Architecture | transformer-decoder |
| Context | 256K tokens |
| Released | Apr 2, 2026 |
| License | Apache 2.0 |
| Ollama | gemma4:31b |
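Beyond the CLI, a local Ollama server exposes a REST API on port 11434. The sketch below calls the `/api/generate` endpoint with the standard request shape (`model`, `prompt`, `stream`); it assumes Ollama is running locally and the model tag shown above has been pulled.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint


def build_request(prompt: str, model: str = "gemma4:31b") -> dict:
    """Build a non-streaming generate request for Ollama's REST API."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str, model: str = "gemma4:31b") -> str:
    """Send a prompt to a locally running Ollama server and return the reply text."""
    payload = json.dumps(build_request(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # Non-streaming responses carry the full completion in the "response" field.
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Why is the sky blue?")` returns the model's full reply as a single string; set `"stream": True` instead to receive incremental JSON chunks.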
## Quantization Options

| Format | File Size | VRAM Required | Ollama Tag |
|---|---|---|---|
| Q4_K_M (recommended) | 20 GB | 22 GB | 31b-it-q4_K_M |
| Q8_0 | 34 GB | 38 GB | 31b-it-q8_0 |
| F16 | 62 GB | 66 GB | 31b-it-fp16 |
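The VRAM figures above follow a simple rule of thumb: weight memory is roughly parameters × bits-per-weight ÷ 8, plus a few GB of headroom for the KV cache and activations. The sketch below encodes that estimate; the bits-per-weight values and the flat overhead are assumptions, so treat the output as a ballpark rather than a guarantee.

```python
def estimate_vram_gb(params_b: float, bits_per_weight: float,
                     overhead_gb: float = 2.0) -> float:
    """Rough VRAM estimate for a dense model.

    params_b: parameter count in billions.
    bits_per_weight: average bits per weight after quantization
        (assumed values: Q4_K_M ~ 4.85, Q8_0 ~ 8.5, F16 = 16).
    overhead_gb: flat allowance for KV cache and activations (assumption).
    """
    weights_gb = params_b * bits_per_weight / 8  # 1e9 params * bits -> bytes -> GB
    return round(weights_gb + overhead_gb, 1)
```

For a 30.7B model at ~4.85 bits per weight this lands in the low twenties of GB, consistent with the 22 GB figure in the table; longer contexts push the KV cache well past the flat 2 GB assumed here.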
## Compatible Hardware

Q4_K_M requires about 22 GB of VRAM, fitting a single RTX 3090/4090/5090 or a Mac with 24 GB+ unified memory.
## Benchmark Scores

| Benchmark | Score |
|---|---|
| MMLU-Pro | 85.2 |
| GPQA Diamond | 84.3 |
| AIME 2026 | 89.2 |
| LiveCodeBench | 80.0 |