NVIDIA GeForce RTX 3060 Ti
NVIDIA · 8GB GDDR6 · Can run 24 models
| Specification | Value |
|---|---|
| Manufacturer | NVIDIA |
| VRAM | 8 GB |
| Memory Type | GDDR6 |
| Architecture | Ampere |
| CUDA Cores | 4,864 |
| Bandwidth | 448 GB/s |
| TDP | 200W |
| MSRP | $399 |
| Released | Dec 1, 2020 |
AI Notes
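The RTX 3060 Ti pairs 8GB of VRAM with 448 GB/s of memory bandwidth, significantly more than the RTX 3060 12GB. The 8GB cap is the main limit: models up to roughly 8B parameters fit entirely in VRAM only at 4-bit quantization (Q4_K_M), while 7B-class models at Q8_0 already need CPU offload. Within that limit, the high bandwidth means faster token generation than many other 8GB cards. A solid used-market option for budget local AI.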
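The link between bandwidth and token generation can be sketched with a simple rule of thumb: if decoding is memory-bound, each generated token requires reading the model weights roughly once, so speed is bounded by bandwidth divided by model size. The speed estimates listed for this card appear consistent with that ratio, but that is an inference from the numbers, not a documented method; the function name below is hypothetical. Real throughput is typically lower, and considerably lower when part of the model is offloaded to system RAM.

```python
# Rule-of-thumb sketch (an assumption, not a documented formula):
# memory-bound decoding reads all weights once per generated token,
# so tokens/s is bounded above by bandwidth / model size.

RTX_3060_TI_BANDWIDTH_GB_S = 448.0  # from the spec table


def estimate_tokens_per_second(model_size_gb: float,
                               bandwidth_gb_s: float = RTX_3060_TI_BANDWIDTH_GB_S) -> float:
    """Return an optimistic upper bound on decode speed in tokens/s."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    # VRAM figures taken from the compatibility table on this page.
    for name, size_gb in [("Gemma 3 1B (Q8_0)", 2.0),
                          ("Qwen 3 8B (Q4_K_M)", 7.5),
                          ("Mistral 7B (Q8_0)", 9.0)]:
        print(f"{name}: ~{estimate_tokens_per_second(size_gb):.0f} tok/s")
```

Because the estimate ignores compute, KV-cache reads, and PCIe transfers, it is best read as a ceiling for comparing cards rather than a prediction.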
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~224 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~149 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~149 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~112 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~90 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs | ~77 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~100 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~90 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~100 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | Runs (tight) | ~60 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | Runs (tight) | ~60 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~50 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~50 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~50 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~50 tok/s |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | CPU Offload | ~45 tok/s |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | CPU Offload | ~41 tok/s |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | CPU Offload | ~43 tok/s |
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | CPU Offload | ~47 tok/s |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~45 tok/s |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~45 tok/s |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~45 tok/s |
| Qwen 2.5 Coder 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~37 tok/s |
| Qwen 3 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~37 tok/s |
19 models are too large for this hardware.
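The Fit labels in the table above can be approximated by comparing a model's VRAM requirement with the card's 8 GB. The sketch below is a guess at that logic: the 90% threshold for "Runs (tight)" and the function name are assumptions chosen so that the listed rows come out the same, and the cutoff at which a model counts as too large is not stated on this page, so it is left out.

```python
# Hypothetical reconstruction of the "Fit" column; thresholds are assumptions.

def classify_fit(vram_needed_gb: float,
                 vram_total_gb: float = 8.0,
                 tight_fraction: float = 0.9) -> str:
    """Label a model by how its VRAM requirement compares to the card's VRAM."""
    if vram_needed_gb > vram_total_gb:
        # Weights do not fit on the GPU; some layers must live in system RAM.
        return "CPU Offload"
    if vram_needed_gb >= tight_fraction * vram_total_gb:
        # Fits, but leaves little headroom for context (KV cache) or other apps.
        return "Runs (tight)"
    return "Runs"


if __name__ == "__main__":
    # VRAM figures taken from the compatibility table.
    for name, needed_gb in [("Phi-3 Mini 3.8B (Q8_0)", 5.8),
                            ("DeepSeek R1 8B (Q4_K_M)", 7.5),
                            ("Llama 3.1 8B (Q8_0)", 10.0)]:
        print(f"{name}: {classify_fit(needed_gb)}")
```

Models in the "Runs (tight)" band may still fall back to offload at long context lengths, since the KV cache grows with the number of tokens held in memory.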