Best AI Models for 192 GB VRAM
Extreme tier: near-datacenter capability, with headroom for 200B+ parameter models at 4-bit quantization. Here are all 96 models you can run locally with 192 GB of memory.
Runs Comfortably (96)
These models fit with room to spare for context window and OS overhead.
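If you want a rough sense of where the VRAM figures below come from, the sketch that follows estimates load size from parameter count and quantization. The bits-per-weight values and the roughly 10% overhead factor are assumptions for illustration, not figures taken from the table.

```python
# Rough VRAM estimate from parameter count and quantization.
# The bits-per-weight values and the 10% loader/KV-cache headroom
# are assumptions for illustration; real usage varies by runtime.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,    # 8-bit weights plus per-block scale factors
    "Q4_K_M": 4.8,  # mixed 4-bit scheme, ~4.8 bits/weight on average
}

def estimate_vram_gb(params_billion: float, quant: str, overhead: float = 1.10) -> float:
    """Approximate GB needed to load the weights, before long contexts."""
    bytes_per_weight = BITS_PER_WEIGHT[quant] / 8.0
    return params_billion * bytes_per_weight * overhead

if __name__ == "__main__":
    # Example: a 27B model at F16 comes out near the ~60 GB shown in the table.
    print(f"{estimate_vram_gb(27, 'F16'):.0f} GB")
```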
| Model | Params | Quantization | VRAM | Quality (out of 5) |
|---|---|---|---|---|
| Nemotron Ultra 253B | 253B | Q4_K_M | 155 GB | 4 |
| Qwen 3 235B-A22B | 235B | Q4_K_M | 138 GB | 4 |
| Mixtral 8x22B | 141B | Q8_0 | 148 GB | 4 |
| Devstral 2 123B | 123B | Q8_0 | 129 GB | 4 |
| Mistral Large 2 123B | 123B | Q8_0 | 129 GB | 4 |
| Qwen 3.5 122B | 122B | Q8_0 | 135 GB | 5 |
| Command A 111B | 111B | Q8_0 | 117 GB | 5 |
| Llama 4 Scout (109B/17B active) | 109B | Q8_0 | 125 GB | 5 |
| Command R+ 104B | 104B | Q8_0 | 110 GB | 5 |
| Llama 3.2 Vision 90B | 90B | Q8_0 | 96 GB | 4 |
| Qwen 2.5 72B | 72B | Q8_0 | 74 GB | 5 |
| Qwen 2.5 VL 72B | 72B | Q8_0 | 78 GB | 4 |
| Cogito 70B | 70B | Q8_0 | 76 GB | 4 |
| DeepSeek R1 70B | 70B | Q8_0 | 72 GB | 5 |
| Llama 3.1 70B | 70B | Q8_0 | 72 GB | 5 |
| Llama 3.3 70B | 70B | Q8_0 | 72 GB | 5 |
| Dolphin Mixtral 8x7B | 47B | F16 | 94 GB | 5 |
| Mixtral 8x7B | 47B | Q8_0 | 49 GB | 5 |
| Command R 35B | 35B | Q8_0 | 37 GB | 5 |
| Qwen 3.5 35B A3B | 35B | Q8_0 | 20 GB | 5 |
| Nous Hermes 2 34B | 34B | F16 | 70 GB | 5 |
| Yi 1.5 34B | 34B | F16 | 70 GB | 5 |
| WizardCoder 33B | 33B | F16 | 69.5 GB | 5 |
| Aya Expanse 32B | 32B | Q8_0 | 38 GB | 5 |
| Cogito 32B | 32B | F16 | 68 GB | 5 |
| DeepSeek R1 32B | 32B | Q8_0 | 34 GB | 5 |
| Qwen 2.5 32B | 32B | Q8_0 | 34 GB | 5 |
| Qwen 2.5 Coder 32B | 32B | F16 | 70 GB | 5 |
| Qwen 3 32B | 32B | F16 | 70 GB | 5 |
| QwQ 32B | 32B | F16 | 68 GB | 5 |
| Gemma 4 31B | 31B | F16 | 66 GB | 5 |
| Qwen 3 30B-A3B (MoE) | 30B | F16 | 67 GB | 5 |
| Gemma 2 27B | 27B | Q8_0 | 29 GB | 5 |
| Gemma 3 27B | 27B | F16 | 60 GB | 5 |
| Qwen 3.5 27B | 27B | Q8_0 | 33 GB | 5 |
| Gemma 4 26B | 26B | Q8_0 | 30 GB | 5 |
| Devstral 24B | 24B | F16 | 53 GB | 5 |
| Magistral Small 24B | 24B | F16 | 52 GB | 5 |
| Mistral Small 3.1 24B | 24B | F16 | 54 GB | 5 |
| Codestral 22B | 22B | Q8_0 | 24 GB | 5 |
| InternLM 2.5 20B | 20B | F16 | 42 GB | 5 |
| StarCoder2 15B | 15B | Q8_0 | 17 GB | 5 |
| DeepSeek R1 14B | 14B | Q8_0 | 16 GB | 5 |
| Phi-4 14B | 14B | Q8_0 | 16 GB | 5 |
| Phi-4 Reasoning 14B | 14B | F16 | 32 GB | 5 |
| Qwen 2.5 14B | 14B | Q8_0 | 16 GB | 5 |
| Qwen 2.5 Coder 14B | 14B | F16 | 33 GB | 5 |
| Qwen 3 14B | 14B | F16 | 33 GB | 5 |
| Gemma 3 12B | 12B | F16 | 28 GB | 5 |
| Mistral Nemo 12B | 12B | F16 | 28 GB | 5 |
| Llama 3.2 Vision 11B | 11B | F16 | 26 GB | 5 |
| Falcon 3 10B | 10B | F16 | 24 GB | 5 |
| Gemma 2 9B | 9B | F16 | 20 GB | 5 |
| Qwen 3.5 9B | 9B | Q8_0 | 11.5 GB | 5 |
| Yi 1.5 9B | 9B | F16 | 20 GB | 5 |
| Yi Coder 9B | 9B | F16 | 21 GB | 5 |
| Aya Expanse 8B | 8B | Q8_0 | 10.5 GB | 5 |
| Cogito 8B | 8B | F16 | 19.5 GB | 5 |
| DeepSeek R1 8B | 8B | F16 | 20 GB | 5 |
| Dolphin 3 8B | 8B | F16 | 18 GB | 5 |
| Granite 3.3 8B | 8B | F16 | 18 GB | 5 |
| Llama 3.1 8B | 8B | F16 | 18 GB | 5 |
| Nemotron 3 Nano 8B | 8B | F16 | 19.5 GB | 5 |
| Nous Hermes 2 8B | 8B | F16 | 18 GB | 5 |
| Qwen 3 8B | 8B | F16 | 20 GB | 5 |
| Codestral Mamba 7B | 7B | F16 | 17 GB | 5 |
| DeepSeek R1 7B | 7B | F16 | 16 GB | 5 |
| Falcon 3 7B | 7B | F16 | 17.5 GB | 5 |
| InternLM 2.5 7B | 7B | F16 | 16 GB | 5 |
| Mistral 7B | 7B | F16 | 16 GB | 5 |
| OpenChat 3.5 7B | 7B | F16 | 17 GB | 5 |
| Qwen 2.5 7B | 7B | F16 | 16 GB | 5 |
| Qwen 2.5 Coder 7B | 7B | F16 | 16 GB | 5 |
| Qwen 2.5 VL 7B | 7B | F16 | 18.5 GB | 5 |
| StarCoder2 7B | 7B | F16 | 16 GB | 5 |
| WizardLM 2 7B | 7B | F16 | 17 GB | 5 |
| Yi 1.5 6B | 6B | F16 | 14 GB | 5 |
| Gemma 3 4B | 4B | F16 | 11.5 GB | 5 |
| Gemma 3n E4B | 4B | F16 | 11 GB | 5 |
| Gemma 4 E4B | 4B | Q8_0 | 10 GB | 5 |
| Qwen 3 4B | 4B | F16 | 11 GB | 5 |
| Qwen 3.5 4B | 4B | Q8_0 | 6.5 GB | 4 |
| Phi-3 Mini 3.8B | 3.8B | F16 | 9.6 GB | 5 |
| Phi-4 Mini 3.8B | 3.8B | F16 | 10.5 GB | 5 |
| Llama 3.2 3B | 3B | F16 | 8 GB | 5 |
| StarCoder2 3B | 3B | F16 | 8 GB | 5 |
| Gemma 2 2B | 2B | F16 | 6 GB | 5 |
| Gemma 3n E2B | 2B | F16 | 6.5 GB | 5 |
| Gemma 4 E2B | 2B | Q8_0 | 6 GB | 5 |
| Qwen 3.5 2B | 2B | Q8_0 | 4.5 GB | 4 |
| SmolLM2 1.7B | 1.7B | F16 | 4.4 GB | 5 |
| DeepSeek R1 1.5B | 1.5B | F16 | 5 GB | 5 |
| Gemma 3 1B | 1B | F16 | 3.5 GB | 5 |
| Llama 3.2 1B | 1B | F16 | 4 GB | 5 |
| Qwen 3.5 0.8B | 0.8B | Q8_0 | 2 GB | 4 |
| Qwen 3 0.6B | 0.6B | F16 | 3.3 GB | 5 |
Want to check a specific combination?
Use the compatibility checker to see exactly how a model runs on your hardware, with performance estimates.
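For a quick offline check before opening the tool, a minimal "does it fit comfortably?" test can look like the sketch below. The 192 GB budget, the flat reserve for OS and context, and the helper name are assumptions for illustration, not the checker's actual logic.

```python
# Minimal fit check against a 192 GB budget, assuming a flat reserve
# for the OS, runtime, and context window. Numbers are illustrative.
TOTAL_MEMORY_GB = 192.0
RESERVED_GB = 16.0  # assumed headroom for OS, runtime, and KV cache

def fits_comfortably(model_vram_gb: float) -> bool:
    """True if the model's VRAM figure leaves the reserved headroom free."""
    return model_vram_gb <= TOTAL_MEMORY_GB - RESERVED_GB

if __name__ == "__main__":
    for name, vram in [("Nemotron Ultra 253B @ Q4_K_M", 155.0),
                       ("Qwen 3 235B-A22B @ Q4_K_M", 138.0)]:
        print(name, "fits" if fits_comfortably(vram) else "does not fit")
```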