AMD Radeon RX 6900 XT
AMD · 16GB GDDR6 · Can run 49 models
Buy Amazon
| Manufacturer | AMD |
| VRAM | 16 GB |
| Memory Type | GDDR6 |
| Architecture | RDNA 2 |
| Stream Procs | 5,120 |
| Bandwidth | 512 GB/s |
| TDP | 300W |
| MSRP | $999 |
| Released | Dec 8, 2020 |
AI Notes
The RX 6900 XT is AMD's top RDNA 2 card with 16GB VRAM and 512 GB/s bandwidth. It matches the RX 6800 XT in memory capacity with higher compute performance. Can run 13B models comfortably and attempt larger models with quantization. Good value on the used market with mature ROCm support.
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Qwen 3 0.6B | 600M | Q4_K_M | 2.5 GB | Runs | ~205 tok/s |
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~256 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~171 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~171 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~128 tok/s |
| Gemma 3n E2B | 2B | Q4_K_M | 3.3 GB | Runs | ~155 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~102 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs | ~88 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~114 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~102 tok/s |
| Gemma 3n E4B | 4B | Q4_K_M | 4.5 GB | Runs | ~114 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~114 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | Runs | ~57 tok/s |
| Falcon 3 7B | 7B | Q4_K_M | 6.8 GB | Runs | ~75 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | Runs | ~57 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | Runs | ~57 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | Runs | ~57 tok/s |
| Qwen 2.5 VL 7B | 7B | Q4_K_M | 7 GB | Runs | ~73 tok/s |
| Cogito 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~68 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~68 tok/s |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | Runs | ~51 tok/s |
| Nemotron 3 Nano 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~68 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~68 tok/s |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | Runs | ~47 tok/s |
| Falcon 3 10B | 10B | Q4_K_M | 8.5 GB | Runs | ~60 tok/s |
| Llama 3.2 Vision 11B | 11B | Q4_K_M | 8.5 GB | Runs | ~60 tok/s |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | Runs | ~49 tok/s |
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | Runs | ~54 tok/s |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~52 tok/s |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~52 tok/s |
| Phi-4 Reasoning 14B | 14B | Q4_K_M | 11 GB | Runs | ~47 tok/s |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~52 tok/s |
| Qwen 2.5 Coder 14B | 14B | Q4_K_M | 12 GB | Runs | ~43 tok/s |
| Qwen 3 14B | 14B | Q4_K_M | 12 GB | Runs | ~43 tok/s |
| Codestral 22B | 22B | Q4_K_M | 14.7 GB | Runs (tight) | ~35 tok/s |
| StarCoder2 15B | 15B | Q8_0 | 17 GB | CPU Offload | ~9 tok/s |
| Devstral 24B | 24B | Q4_K_M | 17 GB | CPU Offload | ~9 tok/s |
| Magistral Small 24B | 24B | Q4_K_M | 17 GB | CPU Offload | ~9 tok/s |
| Mistral Small 3.1 24B | 24B | Q4_K_M | 18 GB | CPU Offload | ~8 tok/s |
| Gemma 2 27B | 27B | Q4_K_M | 17.7 GB | CPU Offload | ~9 tok/s |
| Gemma 3 27B | 27B | Q4_K_M | 20 GB | CPU Offload | ~8 tok/s |
| Qwen 3 30B-A3B (MoE) | 30B | Q4_K_M | 22 GB | CPU Offload | ~7 tok/s |
| Cogito 32B | 32B | Q4_K_M | 21.5 GB | CPU Offload | ~7 tok/s |
| DeepSeek R1 32B | 32B | Q4_K_M | 20.7 GB | CPU Offload | ~8 tok/s |
| Qwen 2.5 32B | 32B | Q4_K_M | 20.7 GB | CPU Offload | ~8 tok/s |
| Qwen 2.5 Coder 32B | 32B | Q4_K_M | 23 GB | CPU Offload | ~7 tok/s |
| Qwen 3 32B | 32B | Q4_K_M | 23 GB | CPU Offload | ~7 tok/s |
| QwQ 32B | 32B | Q4_K_M | 21.5 GB | CPU Offload | ~7 tok/s |
| Command R 35B | 35B | Q4_K_M | 22.5 GB | CPU Offload | ~7 tok/s |
20
model(s) are too large for this hardware.