Best AI Models for 12 GB VRAM

12 GB is the minimum for a good experience: 8B models at Q8 or 14B models at Q4. Here are all 53 models you can run locally with 12 GB of memory.

Runs Comfortably (38)

These models fit with room to spare for the context window and OS overhead.

| Model | Params | Quantization | VRAM | Quality (out of 5) |
|---|---|---|---|---|
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | 4 |
| Llama 3.2 Vision 11B | 11B | Q4_K_M | 8.5 GB | 4 |
| Falcon 3 10B | 10B | Q4_K_M | 8.5 GB | 4 |
| Qwen 3.5 9B | 9B | Q4_K_M | 7.5 GB | 4 |
| Yi Coder 9B | 9B | Q4_K_M | 8 GB | 4 |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | 4 |
| Dolphin 3 8B | 8B | Q8_0 | 10 GB | 5 |
| Granite 3.3 8B | 8B | Q8_0 | 10 GB | 4 |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | 4 |
| Nous Hermes 2 8B | 8B | Q8_0 | 10 GB | 5 |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | 4 |
| Codestral Mamba 7B | 7B | Q8_0 | 9.9 GB | 5 |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | 4 |
| Falcon 3 7B | 7B | Q8_0 | 10 GB | 4 |
| InternLM 2.5 7B | 7B | Q8_0 | 9 GB | 5 |
| Mistral 7B | 7B | Q8_0 | 9 GB | 4 |
| OpenChat 3.5 7B | 7B | Q8_0 | 9.9 GB | 5 |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | 4 |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | 4 |
| StarCoder2 7B | 7B | Q8_0 | 9 GB | 5 |
| WizardLM 2 7B | 7B | Q8_0 | 9.9 GB | 5 |
| Yi 1.5 6B | 6B | Q8_0 | 8 GB | 5 |
| Gemma 3 4B | 4B | Q8_0 | 7.5 GB | 4 |
| Gemma 4 E4B | 4B | Q8_0 | 10 GB | 5 |
| Qwen 3.5 4B | 4B | Q8_0 | 6.5 GB | 4 |
| Phi-3 Mini 3.8B | 3.8B | F16 | 9.6 GB | 5 |
| Llama 3.2 3B | 3B | F16 | 8 GB | 5 |
| StarCoder2 3B | 3B | F16 | 8 GB | 5 |
| Gemma 2 2B | 2B | F16 | 6 GB | 5 |
| Gemma 3n E2B | 2B | F16 | 6.5 GB | 5 |
| Gemma 4 E2B | 2B | Q8_0 | 6 GB | 5 |
| Qwen 3.5 2B | 2B | Q8_0 | 4.5 GB | 4 |
| SmolLM2 1.7B | 1.7B | F16 | 4.4 GB | 5 |
| DeepSeek R1 1.5B | 1.5B | F16 | 5 GB | 5 |
| Gemma 3 1B | 1B | F16 | 3.5 GB | 5 |
| Llama 3.2 1B | 1B | F16 | 4 GB | 5 |
| Qwen 3.5 0.8B | 0.8B | Q8_0 | 2 GB | 4 |
| Qwen 3 0.6B | 0.6B | F16 | 3.3 GB | 5 |
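The VRAM figures above roughly track parameter count times bits per weight, plus runtime overhead. If you want a back-of-the-envelope check for a model not listed here, a minimal sketch follows; the bits-per-weight values are approximations for common GGUF quantization types, not exact figures, and the estimate covers weights only (no KV cache or OS overhead):

```python
# Rough VRAM estimate for model weights alone.
# Bits-per-weight figures are approximate for GGUF quant types (assumption).
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q8_0": 8.5, "F16": 16.0}

def estimate_vram_gb(params_billions: float, quant: str) -> float:
    """Approximate GB needed to hold the weights at a given quantization."""
    bits = BITS_PER_WEIGHT[quant]
    return params_billions * 1e9 * bits / 8 / 1024**3

print(round(estimate_vram_gb(8, "Q8_0"), 1))  # prints 7.9
```

An 8B model at Q8_0 comes out to roughly 7.9 GB of weights, which is why the tables above list it around 10 GB once context and overhead are included.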

Tight Fit (15)

These models run, but with a limited context window. Close other apps to free up memory.

| Model | Params | Quantization | VRAM | Quality (out of 5) |
|---|---|---|---|---|
| StarCoder2 15B | 15B | Q4_K_M | 10.5 GB | 3 |
| DeepSeek R1 14B | 14B | Q5_K_M | 11.3 GB | 4 |
| Phi-4 14B | 14B | Q5_K_M | 11.3 GB | 4 |
| Phi-4 Reasoning 14B | 14B | Q4_K_M | 11 GB | 4 |
| Qwen 2.5 14B | 14B | Q5_K_M | 11.3 GB | 4 |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | 4 |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | 4 |
| Yi 1.5 9B | 9B | Q8_0 | 11 GB | 5 |
| Aya Expanse 8B | 8B | Q8_0 | 10.5 GB | 5 |
| Cogito 8B | 8B | Q8_0 | 11 GB | 4 |
| Nemotron 3 Nano 8B | 8B | Q8_0 | 11 GB | 4 |
| Qwen 2.5 VL 7B | 7B | Q8_0 | 10.5 GB | 4 |
| Gemma 3n E4B | 4B | F16 | 11 GB | 5 |
| Qwen 3 4B | 4B | F16 | 11 GB | 5 |
| Phi-4 Mini 3.8B | 3.8B | F16 | 10.5 GB | 5 |
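The "limited context window" caveat comes mostly from the KV cache, which grows linearly with context length on top of the weight memory. A rough sizing sketch follows; the layer and head counts are illustrative assumptions (roughly typical of an 8B model with grouped-query attention), so check a model's actual config for real values:

```python
# Rough KV-cache size: 2 tensors (K and V) per layer, each
# kv_heads * head_dim values per token, at bytes_per_elem precision.
# All architecture numbers passed in are assumptions for illustration.
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_elem: int = 2) -> float:
    """Approximate GB of KV cache for a given context length (FP16 default)."""
    total = 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_elem
    return total / 1024**3

# e.g. 32 layers, 8 KV heads, head_dim 128, 8k context at FP16:
print(round(kv_cache_gb(32, 8, 128, 8192), 2))  # prints 1.0
```

Around 1 GB for an 8k context is why an 11 GB model is a tight fit on a 12 GB card: the remaining headroom only covers a short context plus overhead.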

Want to check a specific combination?

Use the compatibility checker to see exactly how a model runs on your specific hardware, with performance estimates.

Open Compatibility Checker
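The comfortable/tight tiering on this page boils down to a simple headroom check. Below is a hypothetical helper, not the site's actual checker; the 2 GB headroom figure is an assumption chosen to match the cutoffs in the tables above:

```python
# Hypothetical fit check mirroring this page's tiers (not the real checker).
# headroom_gb reserves space for context window and OS overhead (assumption).
def fits(model_vram_gb: float, total_vram_gb: float = 12.0,
         headroom_gb: float = 2.0) -> str:
    if model_vram_gb + headroom_gb <= total_vram_gb:
        return "comfortable"
    if model_vram_gb <= total_vram_gb:
        return "tight"
    return "no fit"

print(fits(9.5))   # Mistral Nemo 12B at Q4_K_M -> comfortable
print(fits(11.3))  # Phi-4 14B at Q5_K_M -> tight
```

With this rule, everything at 10 GB or under lands in the comfortable tier and everything from 10.5 GB up to 12 GB lands in the tight tier, matching the tables.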