AMD Radeon RX 7900 XT
AMD · 20GB GDDR6 · Can run 50 models
| Spec | Value |
|---|---|
| Manufacturer | AMD |
| VRAM | 20 GB |
| Memory Type | GDDR6 |
| Architecture | RDNA 3 |
| Stream Processors | 5,376 |
| Bandwidth | 800 GB/s |
| TDP | 315 W |
| MSRP | $899 |
| Released | Dec 13, 2022 |
AI Notes
The RX 7900 XT's 20GB of VRAM sits between the common 16GB and 24GB tiers. It comfortably runs 13B–14B models at high-quality quantizations and fits some 24B–27B models at Q4, while 30B-class models generally need partial CPU offload. ROCm support continues to improve, making the card a viable alternative to NVIDIA for local AI inference.
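A rough way to judge whether a quantized model fits in 20GB is weights (parameters × effective bits per weight ÷ 8) plus a flat allowance for KV cache and activations. The sketch below is a rule of thumb, not the method used to build this page: the bits-per-weight figures are approximations for llama.cpp-style quants, and the 1.5GB overhead is an assumption (real usage varies with context length and runtime).

```python
# Rough VRAM estimate for a quantized LLM. The quant bit-widths are
# approximate effective bits/weight for llama.cpp-style formats, and the
# flat overhead for KV cache + activations is an assumed figure.
QUANT_BITS = {
    "Q4_K_M": 4.85,
    "Q8_0": 8.5,
}

def estimate_vram_gb(params_billions: float, quant: str,
                     overhead_gb: float = 1.5) -> float:
    """Estimate VRAM in GB: weight bytes plus a flat runtime overhead."""
    weight_gb = params_billions * QUANT_BITS[quant] / 8
    return round(weight_gb + overhead_gb, 1)

# A 7B model at Q8_0 lands near 9 GB; a 32B model at Q4_K_M lands
# just over the card's 20 GB, matching the CPU Offload rows below.
print(estimate_vram_gb(7, "Q8_0"))
print(estimate_vram_gb(32, "Q4_K_M"))
```

With this heuristic, anything whose estimate exceeds 20GB falls into the CPU Offload tier in the table below, which tracks the listed figures reasonably well.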
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Qwen 3 0.6B | 600M | Q4_K_M | 2.5 GB | Runs | ~320 tok/s |
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~400 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~267 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~267 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~200 tok/s |
| Gemma 3n E2B | 2B | Q4_K_M | 3.3 GB | Runs | ~242 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~160 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs | ~138 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~178 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~160 tok/s |
| Gemma 3n E4B | 4B | Q4_K_M | 4.5 GB | Runs | ~178 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~178 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | Runs | ~89 tok/s |
| Falcon 3 7B | 7B | Q4_K_M | 6.8 GB | Runs | ~118 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | Runs | ~89 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | Runs | ~89 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | Runs | ~89 tok/s |
| Qwen 2.5 VL 7B | 7B | Q4_K_M | 7 GB | Runs | ~114 tok/s |
| Cogito 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~107 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~107 tok/s |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | Runs | ~80 tok/s |
| Nemotron 3 Nano 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~107 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~107 tok/s |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | Runs | ~73 tok/s |
| Falcon 3 10B | 10B | Q4_K_M | 8.5 GB | Runs | ~94 tok/s |
| Llama 3.2 Vision 11B | 11B | Q4_K_M | 8.5 GB | Runs | ~94 tok/s |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | Runs | ~76 tok/s |
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | Runs | ~84 tok/s |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~81 tok/s |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~81 tok/s |
| Phi-4 Reasoning 14B | 14B | Q4_K_M | 11 GB | Runs | ~73 tok/s |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~81 tok/s |
| Qwen 2.5 Coder 14B | 14B | Q4_K_M | 12 GB | Runs | ~67 tok/s |
| Qwen 3 14B | 14B | Q4_K_M | 12 GB | Runs | ~67 tok/s |
| StarCoder2 15B | 15B | Q8_0 | 17 GB | Runs | ~47 tok/s |
| Codestral 22B | 22B | Q4_K_M | 14.7 GB | Runs | ~54 tok/s |
| Devstral 24B | 24B | Q4_K_M | 17 GB | Runs | ~47 tok/s |
| Magistral Small 24B | 24B | Q4_K_M | 17 GB | Runs | ~47 tok/s |
| Mistral Small 3.1 24B | 24B | Q4_K_M | 18 GB | Runs (tight) | ~44 tok/s |
| Gemma 2 27B | 27B | Q4_K_M | 17.7 GB | Runs (tight) | ~45 tok/s |
| Gemma 3 27B | 27B | Q4_K_M | 20 GB | CPU Offload | ~12 tok/s |
| Qwen 3 30B-A3B (MoE) | 30B | Q4_K_M | 22 GB | CPU Offload | ~11 tok/s |
| Cogito 32B | 32B | Q4_K_M | 21.5 GB | CPU Offload | ~11 tok/s |
| DeepSeek R1 32B | 32B | Q4_K_M | 20.7 GB | CPU Offload | ~12 tok/s |
| Qwen 2.5 32B | 32B | Q4_K_M | 20.7 GB | CPU Offload | ~12 tok/s |
| Qwen 2.5 Coder 32B | 32B | Q4_K_M | 23 GB | CPU Offload | ~11 tok/s |
| Qwen 3 32B | 32B | Q4_K_M | 23 GB | CPU Offload | ~11 tok/s |
| QwQ 32B | 32B | Q4_K_M | 21.5 GB | CPU Offload | ~11 tok/s |
| Command R 35B | 35B | Q4_K_M | 22.5 GB | CPU Offload | ~11 tok/s |
| Mixtral 8x7B | 47B | Q4_K_M | 29.7 GB | CPU Offload | ~8 tok/s |
19 models are too large for this hardware.
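The Est. Speed column is consistent with a memory-bandwidth-bound model of token generation: each new token streams every weight from VRAM once, so peak decode speed is roughly bandwidth divided by model size in memory. A minimal sketch of that heuristic (an upper bound; real throughput is lower once compute and KV-cache reads are counted):

```python
# Bandwidth-bound decode heuristic: each generated token reads all
# weights once, so peak tok/s ~= memory bandwidth / model size in VRAM.
# This is an upper bound, not a benchmark.
BANDWIDTH_GBS = 800  # RX 7900 XT memory bandwidth

def est_tokens_per_sec(model_size_gb: float) -> int:
    return round(BANDWIDTH_GBS / model_size_gb)

# 800 / 10 GB -> ~80 tok/s, matching the Llama 3.1 8B Q8_0 row.
print(est_tokens_per_sec(10))
```

Dividing 800 GB/s by the VRAM Used column reproduces the table's "Runs" speeds almost exactly; the CPU Offload rows are far below this bound because layers spilled to system RAM are limited by the much slower PCIe/DDR path.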