Gemma 3 12B
by Google · gemma-3 family
12B parameters
Tags: text-generation, code-generation, reasoning, multilingual, vision, math, summarization
Gemma 3 12B is the sweet spot of the Gemma 3 family: multimodal, with a 128K-token context window, and strong enough to compete with models twice its size. It is one of the most popular models on Ollama, with tens of millions of pulls. At Q4_K_M quantization it needs about 10.5 GB of VRAM, so it fits on 12-16 GB GPUs (tightly on 12 GB cards) and delivers excellent results for conversation, coding, reasoning, and image understanding. A strong all-rounder for anyone with mid-range hardware.
Quick Start with Ollama
ollama run gemma3:12b-it-q4_K_M

| Field | Value |
|---|---|
| Creator | Google |
| Parameters | 12B |
| Architecture | transformer-decoder |
| Context | 128K tokens |
| Released | Mar 12, 2025 |
| License | Gemma Terms of Use |
| Ollama | gemma3:12b |
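
Beyond the interactive `ollama run` command, the same model name can be used from a script through Ollama's local HTTP API. The following is a minimal sketch, assuming an Ollama server is already running on its default port (11434) and the model has been pulled; the helper names `build_request` and `generate` are illustrative, not part of any library.

```python
import json
import urllib.request

# Default endpoint of a locally running Ollama server (assumed to be running).
OLLAMA_URL = "http://localhost:11434/api/generate"
# Ollama model name taken from the table above.
MODEL = "gemma3:12b"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text (requires a live server)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example (only works with a running server and a pulled model):
# print(generate("Explain what a context window is in one sentence."))
```

Setting `"stream": False` returns one JSON object instead of a stream of chunks, which keeps the sketch short; a chat UI would more likely stream.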
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M (recommended) | 8.1 GB | 10.5 GB | ★★★★★ | 12b-it-q4_K_M |
| Q8_0 | 13 GB | 16 GB | ★★★★★ | 12b-it-q8_0 |
| F16 | 24 GB | 28 GB | ★★★★★ | 12b-it-fp16 |
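
One way to read the table above is "pick the highest-precision format whose VRAM requirement fits your card." The sketch below encodes that rule using only the figures from the table; the function name `pick_quant` and the selection rule itself are assumptions for illustration, not an official recommendation procedure.

```python
from typing import Optional

# (format, file size in GB, VRAM required in GB, Ollama tag), copied from the
# table above and ordered from highest to lowest precision.
QUANTS = [
    ("F16", 24.0, 28.0, "12b-it-fp16"),
    ("Q8_0", 13.0, 16.0, "12b-it-q8_0"),
    ("Q4_K_M", 8.1, 10.5, "12b-it-q4_K_M"),
]


def pick_quant(available_vram_gb: float) -> Optional[str]:
    """Return the Ollama model reference for the highest-precision format
    whose listed VRAM requirement fits, or None if none fits."""
    for _name, _file_size_gb, vram_needed_gb, tag in QUANTS:
        if available_vram_gb >= vram_needed_gb:
            return f"gemma3:{tag}"
    return None


print(pick_quant(24))  # a 24 GB card
print(pick_quant(12))  # a 12 GB card
print(pick_quant(8))   # an 8 GB card
```

Note that this rule treats an exact match (for example a 16 GB card and the 16 GB Q8_0 requirement) as a fit, which leaves no headroom for long contexts; stepping down one format is the safer choice in that case.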
Compatible Hardware
for Q4_K_M (10.5 GB VRAM)
| Hardware | VRAM | Type | Fit | Est. Speed |
|---|---|---|---|---|
| Mac Studio M4 Ultra 512GB | 512 GB | mac | Runs | ~78 tok/s |
| Mac Pro M2 Ultra 192GB | 192 GB | mac | Runs | ~76 tok/s |
| Mac Studio M4 Ultra 192GB | 192 GB | mac | Runs | ~78 tok/s |
| Mac Studio M4 Max 128GB | 128 GB | mac | Runs | ~52 tok/s |
| MacBook Pro M4 Max 128GB | 128 GB | mac | Runs | ~52 tok/s |
| MacBook Pro M3 Max 96GB | 96 GB | mac | Runs | ~38 tok/s |
| Mac mini M4 Pro 64GB | 64 GB | mac | Runs | ~26 tok/s |
| Mac Studio M4 Max 64GB | 64 GB | mac | Runs | ~52 tok/s |
| MacBook Pro M4 Max 64GB | 64 GB | mac | Runs | ~52 tok/s |
| Mac mini M4 Pro 48GB | 48 GB | mac | Runs | ~26 tok/s |
| MacBook Pro M3 Max 48GB | 48 GB | mac | Runs | ~38 tok/s |
| MacBook Pro M4 Max 48GB | 48 GB | mac | Runs | ~52 tok/s |
| MacBook Pro M4 Pro 48GB | 48 GB | mac | Runs | ~26 tok/s |
| Mac Studio M4 Max 36GB | 36 GB | mac | Runs | ~52 tok/s |
| MacBook Pro M3 Pro 36GB | 36 GB | mac | Runs | ~14 tok/s |
| NVIDIA GeForce RTX 5090 | 32 GB | gpu | Runs | ~171 tok/s |
| iMac M4 32GB | 32 GB | mac | Runs | ~11 tok/s |
| Mac mini M4 32GB | 32 GB | mac | Runs | ~11 tok/s |
| MacBook Air M4 32GB | 32 GB | mac | Runs | ~11 tok/s |
| AMD Radeon RX 7900 XTX | 24 GB | gpu | Runs | ~91 tok/s |
| NVIDIA GeForce RTX 3090 | 24 GB | gpu | Runs | ~89 tok/s |
| NVIDIA GeForce RTX 4090 | 24 GB | gpu | Runs | ~96 tok/s |
| iMac M3 24GB | 24 GB | mac | Runs | ~10 tok/s |
| Mac mini M2 24GB | 24 GB | mac | Runs | ~10 tok/s |
| Mac mini M4 Pro 24GB | 24 GB | mac | Runs | ~26 tok/s |
| MacBook Air M2 24GB | 24 GB | mac | Runs | ~10 tok/s |
| MacBook Air M4 24GB | 24 GB | mac | Runs | ~11 tok/s |
| MacBook Pro M4 Pro 24GB | 24 GB | mac | Runs | ~26 tok/s |
| AMD Radeon RX 7900 XT | 20 GB | gpu | Runs | ~76 tok/s |
| MacBook Pro M3 Pro 18GB | 18 GB | mac | Runs | ~14 tok/s |
| AMD Radeon RX 6800 XT | 16 GB | gpu | Runs | ~49 tok/s |
| AMD Radeon RX 7800 XT | 16 GB | gpu | Runs | ~59 tok/s |
| Intel Arc A770 | 16 GB | gpu | Runs | ~53 tok/s |
| NVIDIA GeForce RTX 4060 Ti 16GB | 16 GB | gpu | Runs | ~27 tok/s |
| NVIDIA GeForce RTX 4070 Ti Super | 16 GB | gpu | Runs | ~64 tok/s |
| NVIDIA GeForce RTX 4080 Super | 16 GB | gpu | Runs | ~70 tok/s |
| NVIDIA GeForce RTX 4080 | 16 GB | gpu | Runs | ~68 tok/s |
| NVIDIA GeForce RTX 5070 Ti | 16 GB | gpu | Runs | ~85 tok/s |
| NVIDIA GeForce RTX 5080 | 16 GB | gpu | Runs | ~91 tok/s |
| iMac M1 16GB | 16 GB | mac | Runs | ~6 tok/s |
| iMac M4 16GB | 16 GB | mac | Runs | ~11 tok/s |
| Mac mini M1 16GB | 16 GB | mac | Runs | ~6 tok/s |
| Mac mini M4 16GB | 16 GB | mac | Runs | ~11 tok/s |
| MacBook Air M2 16GB | 16 GB | mac | Runs | ~10 tok/s |
| MacBook Air M3 16GB | 16 GB | mac | Runs | ~10 tok/s |
| MacBook Air M4 16GB | 16 GB | mac | Runs | ~11 tok/s |
| MacBook Pro M1 16GB | 16 GB | mac | Runs | ~6 tok/s |
| MacBook Pro M2 Pro 16GB | 16 GB | mac | Runs | ~19 tok/s |
| AMD Radeon RX 7700 XT | 12 GB | gpu | Runs (tight) | ~41 tok/s |
| NVIDIA GeForce RTX 3060 12GB | 12 GB | gpu | Runs (tight) | ~34 tok/s |
| NVIDIA GeForce RTX 3080 12GB | 12 GB | gpu | Runs (tight) | ~87 tok/s |
| NVIDIA GeForce RTX 4070 Super | 12 GB | gpu | Runs (tight) | ~48 tok/s |
| NVIDIA GeForce RTX 4070 Ti | 12 GB | gpu | Runs (tight) | ~48 tok/s |
| NVIDIA GeForce RTX 4070 | 12 GB | gpu | Runs (tight) | ~48 tok/s |
| NVIDIA GeForce RTX 5070 | 12 GB | gpu | Runs (tight) | ~64 tok/s |
| NVIDIA GeForce RTX 2080 Ti | 11 GB | gpu | CPU Offload | ~59 tok/s |
| NVIDIA GeForce RTX 3080 10GB | 10 GB | gpu | CPU Offload | ~72 tok/s |
| AMD Radeon RX 7600 | 8 GB | gpu | CPU Offload | ~27 tok/s |
| Intel Arc A750 | 8 GB | gpu | CPU Offload | ~49 tok/s |
| NVIDIA GeForce RTX 3060 Ti | 8 GB | gpu | CPU Offload | ~43 tok/s |
| NVIDIA GeForce RTX 3070 | 8 GB | gpu | CPU Offload | ~43 tok/s |
| NVIDIA GeForce RTX 4060 Ti 8GB | 8 GB | gpu | CPU Offload | ~27 tok/s |
| NVIDIA GeForce RTX 4060 | 8 GB | gpu | CPU Offload | ~26 tok/s |
| MacBook Air M1 8GB | 8 GB | mac | CPU Offload | ~6 tok/s |
| MacBook Air M2 8GB | 8 GB | mac | CPU Offload | ~10 tok/s |
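
The Fit column above appears to depend on how much headroom a device has over the 10.5 GB Q4_K_M requirement: 16 GB and up is listed as "Runs", 12 GB as "Runs (tight)", and 11 GB or less as "CPU Offload". The page does not state the exact cutoffs, so the sketch below uses hypothetical headroom factors (1.25x and 1.10x) chosen only because they reproduce the rows in the table.

```python
# VRAM requirement for Q4_K_M, from the section heading above.
REQUIRED_VRAM_GB = 10.5

# Hypothetical headroom factors (assumptions, not stated by the source).
# They are picked so the output agrees with the Fit column of the table.
RUNS_FACTOR = 1.25   # at least 25% headroom -> "Runs"
TIGHT_FACTOR = 1.10  # at least 10% headroom -> "Runs (tight)"


def classify_fit(vram_gb: float, required_gb: float = REQUIRED_VRAM_GB) -> str:
    """Classify a device by its memory relative to the model's requirement."""
    if vram_gb >= required_gb * RUNS_FACTOR:
        return "Runs"
    if vram_gb >= required_gb * TIGHT_FACTOR:
        return "Runs (tight)"
    return "CPU Offload"


for vram in (24, 16, 12, 11, 8):
    print(f"{vram} GB -> {classify_fit(vram)}")
```

The same function can be reused for the other formats by passing their VRAM requirement as `required_gb`, with the caveat that the factors remain guesses.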
Benchmark Scores
| Benchmark | Score |
|---|---|
| MMLU | 76.0 |