Gemma 3 12B

Name: Gemma 3 12B
Author: Google

Gemma Terms of Use

Google · 12B · transformer-decoder

2025-03-12 131K context 12B params

Use Cases

chat code reasoning multilingual vision math summary

Quantization Options

Quant	Bits	VRAM	Quality
Q4_K_M (recommended)	4	10.5 GB	Good
Q8_0	8	16.0 GB	Excellent
F16	16	28.0 GB	Excellent
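
As a rough cross-check of the VRAM column, the sketch below computes raw weight storage from the nominal bit width (parameters × bits / 8) and compares it with the listed figure. The helper names `weight_gb` and `best_fit` are hypothetical, and the 12-billion parameter count is the nominal value from this page. The gap between raw weights and listed VRAM presumably covers the KV cache, activations, and runtime overhead; that is an assumption, not something the page states.

```python
# Quantization rows copied from the table above: (name, nominal bits, listed VRAM in GB).
QUANTS = [
    ("Q4_K_M", 4, 10.5),
    ("Q8_0", 8, 16.0),
    ("F16", 16, 28.0),
]

PARAMS_BILLIONS = 12  # nominal parameter count listed on this page


def weight_gb(params_billions: float, bits: int) -> float:
    """Raw weight storage in decimal GB: parameters * bits / 8 bytes."""
    return params_billions * bits / 8


def best_fit(vram_gb: float):
    """Return the highest-precision quant whose listed VRAM fits, or None."""
    fitting = [q for q in QUANTS if q[2] <= vram_gb]
    return max(fitting, key=lambda q: q[1])[0] if fitting else None


for name, bits, listed in QUANTS:
    raw = weight_gb(PARAMS_BILLIONS, bits)
    # Headroom = listed VRAM minus raw weights (assumed to be cache and overhead).
    print(f"{name}: ~{raw:.1f} GB weights, {listed} GB listed, {listed - raw:.1f} GB headroom")
```

With these numbers, `best_fit(12)` selects Q4_K_M and `best_fit(16)` selects Q8_0, though a card with exactly the listed amount leaves no margin for anything else running on the GPU.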

About this model

Gemma 3 12B is the sweet spot of the Gemma 3 family: multimodal, with a 128K-token context window (131,072 tokens, listed as 131K elsewhere on this page), and strong enough to compete with models twice its size. It's one of the most popular models on Ollama, with tens of millions of pulls. At Q4_K_M it needs about 10.5 GB of VRAM, so it fits comfortably on 12-16 GB GPUs and delivers excellent results for conversation, coding, reasoning, and image understanding. A strong all-rounder for anyone with mid-range hardware.

Benchmarks

MMLU: 76.0

Install

Ollama

ollama run gemma3:12b-it-q4_K_M
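
Once the model has been pulled, it can also be called programmatically. The following is a minimal sketch using only the Python standard library against Ollama's local REST endpoint (`/api/generate` on the default port 11434). The helper names `build_request` and `generate` are hypothetical, and the sketch assumes the server is running locally with the same model tag as the command above.

```python
import json
import urllib.request

# Assumptions: Ollama is serving on its default local address, and the
# model tag matches the `ollama run` command above.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "gemma3:12b-it-q4_K_M"


def build_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": MODEL_TAG, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize the Gemma Terms of Use in one sentence.")` would return the model's reply as a string. Setting `"stream": False` keeps the example simple by returning one JSON object instead of a stream of chunks.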

llama.cpp / GGUF

Download GGUF from HuggingFace

Specs

Parameters: 12B
Architecture: transformer-decoder
Context: 131K tokens
Min VRAM: 10.5 GB
Recommended: 10.5 GB
Family: Gemma 3
Released: 2025-03-12
License: Gemma Terms of Use