AMD Radeon RX 9070
AMD · 16GB GDDR6 · Can run 49 models
| Spec | Value |
|---|---|
| Manufacturer | AMD |
| VRAM | 16 GB |
| Memory Type | GDDR6 |
| Architecture | RDNA 4 |
| Stream Procs | 3,584 |
| Bandwidth | 640 GB/s |
| TDP | 220W |
| MSRP | $549 |
| Released | Mar 6, 2025 |
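The bandwidth figure follows from the card's 256-bit memory bus and 20 Gbps effective GDDR6 data rate, a quick sanity check:

```python
# Memory bandwidth = bus width (bits) x effective data rate (Gbps) / 8 bits-per-byte.
# RX 9070: 256-bit bus, 20 Gbps GDDR6.
bus_width_bits = 256
data_rate_gbps = 20
bandwidth_gbs = bus_width_bits * data_rate_gbps / 8
print(bandwidth_gbs)  # 640.0 GB/s
```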
AI Notes
The RX 9070 brings RDNA 4 to a more accessible price point with 16 GB of VRAM. It runs 13B models comfortably and can attempt larger ones with aggressive quantization. Its fewer compute units and lower clocks relative to the 9070 XT mean slightly slower token generation, but memory bandwidth and VRAM capacity are identical. A compelling option if ROCm support for RDNA 4 continues to mature.
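The VRAM figures below follow the usual weights-plus-overhead arithmetic: parameter count times bits per weight, plus runtime overhead for KV cache and buffers. A minimal sketch (the `estimate_vram_gb` helper and its flat 1.5 GB overhead constant are illustrative assumptions, not the site's exact formula; real usage also scales with context length):

```python
def estimate_vram_gb(params_billions, bits_per_weight, overhead_gb=1.5):
    """Rough VRAM estimate: quantized weights plus a flat runtime overhead.

    overhead_gb is an assumed placeholder for KV cache / buffers;
    actual overhead grows with context length.
    """
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + overhead_gb

# A 13B model at ~4.5 bits/weight (roughly Q4_K_M):
print(round(estimate_vram_gb(13, 4.5), 1))  # 8.8 -> fits easily in 16 GB
```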
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Qwen 3 0.6B | 600M | Q4_K_M | 2.5 GB | Runs | ~215 tok/s |
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~269 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~179 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~179 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~135 tok/s |
| Gemma 3n E2B | 2B | Q4_K_M | 3.3 GB | Runs | ~163 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~108 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs | ~93 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~120 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~108 tok/s |
| Gemma 3n E4B | 4B | Q4_K_M | 4.5 GB | Runs | ~120 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~120 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | Runs | ~60 tok/s |
| Falcon 3 7B | 7B | Q4_K_M | 6.8 GB | Runs | ~79 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | Runs | ~60 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | Runs | ~60 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | Runs | ~60 tok/s |
| Qwen 2.5 VL 7B | 7B | Q4_K_M | 7 GB | Runs | ~77 tok/s |
| Cogito 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~72 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~72 tok/s |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | Runs | ~54 tok/s |
| Nemotron 3 Nano 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~72 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~72 tok/s |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | Runs | ~49 tok/s |
| Falcon 3 10B | 10B | Q4_K_M | 8.5 GB | Runs | ~63 tok/s |
| Llama 3.2 Vision 11B | 11B | Q4_K_M | 8.5 GB | Runs | ~63 tok/s |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | Runs | ~51 tok/s |
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | Runs | ~57 tok/s |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~54 tok/s |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~54 tok/s |
| Phi-4 Reasoning 14B | 14B | Q4_K_M | 11 GB | Runs | ~49 tok/s |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | Runs | ~54 tok/s |
| Qwen 2.5 Coder 14B | 14B | Q4_K_M | 12 GB | Runs | ~45 tok/s |
| Qwen 3 14B | 14B | Q4_K_M | 12 GB | Runs | ~45 tok/s |
| Codestral 22B | 22B | Q4_K_M | 14.7 GB | Runs (tight) | ~37 tok/s |
| StarCoder2 15B | 15B | Q8_0 | 17 GB | CPU Offload | ~10 tok/s |
| Devstral 24B | 24B | Q4_K_M | 17 GB | CPU Offload | ~10 tok/s |
| Magistral Small 24B | 24B | Q4_K_M | 17 GB | CPU Offload | ~10 tok/s |
| Mistral Small 3.1 24B | 24B | Q4_K_M | 18 GB | CPU Offload | ~9 tok/s |
| Gemma 2 27B | 27B | Q4_K_M | 17.7 GB | CPU Offload | ~9 tok/s |
| Gemma 3 27B | 27B | Q4_K_M | 20 GB | CPU Offload | ~8 tok/s |
| Qwen 3 30B-A3B (MoE) | 30B | Q4_K_M | 22 GB | CPU Offload | ~7 tok/s |
| Cogito 32B | 32B | Q4_K_M | 21.5 GB | CPU Offload | ~8 tok/s |
| DeepSeek R1 32B | 32B | Q4_K_M | 20.7 GB | CPU Offload | ~8 tok/s |
| Qwen 2.5 32B | 32B | Q4_K_M | 20.7 GB | CPU Offload | ~8 tok/s |
| Qwen 2.5 Coder 32B | 32B | Q4_K_M | 23 GB | CPU Offload | ~7 tok/s |
| Qwen 3 32B | 32B | Q4_K_M | 23 GB | CPU Offload | ~7 tok/s |
| QwQ 32B | 32B | Q4_K_M | 21.5 GB | CPU Offload | ~8 tok/s |
| Command R 35B | 35B | Q4_K_M | 22.5 GB | CPU Offload | ~7 tok/s |
20 models are too large for this hardware.
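The Fit column above can be reproduced with a simple threshold rule on the 16 GB budget. A sketch, assuming a 2 GB headroom cutoff inferred from the table rows (the function name and thresholds are my guesses, not the site's actual logic):

```python
def classify_fit(vram_needed_gb, vram_total_gb=16.0, headroom_gb=2.0):
    """Classify model fit on a card (illustrative thresholds, inferred from the table)."""
    if vram_needed_gb <= vram_total_gb - headroom_gb:
        return "Runs"            # comfortable margin for KV cache growth
    if vram_needed_gb <= vram_total_gb:
        return "Runs (tight)"    # fits, but little room for long contexts
    return "CPU Offload"         # some layers must spill to system RAM

print(classify_fit(9.9))   # Runs          (e.g. Phi-4 14B Q4_K_M)
print(classify_fit(14.7))  # Runs (tight)  (e.g. Codestral 22B Q4_K_M)
print(classify_fit(17.0))  # CPU Offload   (e.g. Devstral 24B Q4_K_M)
```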