NVIDIA GeForce RTX 3080 10GB
NVIDIA · 10GB GDDR6X · Can run 25 models
| Spec | Value |
|---|---|
| Manufacturer | NVIDIA |
| VRAM | 10 GB |
| Memory Type | GDDR6X |
| Architecture | Ampere |
| CUDA Cores | 8,704 |
| Bandwidth | 760 GB/s |
| TDP | 320 W |
| MSRP | $699 |
| Released | Sep 17, 2020 |
AI Notes
The RTX 3080 10GB offers excellent memory bandwidth at 760 GB/s, making it one of the fastest consumer cards for small-model inference. The 10 GB of VRAM comfortably fits models up to about 7B parameters; 12–14B models fit only with aggressive Q4 quantization, and anything larger requires CPU offload. It remains very popular on the used market for local AI workloads.
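Why bandwidth matters so much: during decoding, a dense LLM streams its entire weight set from VRAM for every generated token, so memory bandwidth sets a hard ceiling on tokens per second. A minimal back-of-envelope sketch (the bits-per-weight figures for GGUF quants are approximate community values, not exact format specs, and real throughput lands below this ceiling due to compute and framework overhead):

```python
# Rough decode-speed ceiling: bandwidth / bytes streamed per token.
# Bits-per-weight values are approximations for common GGUF quant formats.
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q8_0": 8.5, "F16": 16.0}

def max_tok_per_s(params_b: float, quant: str,
                  bandwidth_gbs: float = 760.0) -> float:
    """Theoretical upper bound on tokens/s for a dense model that is
    fully resident in VRAM on a card with the given memory bandwidth."""
    weight_gb = params_b * BITS_PER_WEIGHT[quant] / 8  # GB read per token
    return round(bandwidth_gbs / weight_gb, 1)

# A 7B model at Q8_0 on the RTX 3080's 760 GB/s:
print(max_tok_per_s(7, "Q8_0"))  # ~102 tok/s ceiling
```

The table below lists ~84 tok/s for 7B Q8_0, i.e. roughly 80% of this ceiling, which is typical for well-optimized inference. For very small models (1–3B) the estimate overshoots badly, because per-token launch and sampling overhead dominates instead of bandwidth.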
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~380 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~253 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~253 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~190 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~152 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs | ~131 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~169 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~152 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~169 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~101 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~101 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | Runs (tight) | ~84 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | Runs (tight) | ~84 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | Runs (tight) | ~84 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | Runs (tight) | ~84 tok/s |
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | Runs (tight) | ~80 tok/s |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | CPU Offload | ~76 tok/s |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | CPU Offload | ~69 tok/s |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | CPU Offload | ~72 tok/s |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~77 tok/s |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~77 tok/s |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~77 tok/s |
| Qwen 2.5 Coder 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~63 tok/s |
| Qwen 3 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~63 tok/s |
| Codestral 22B | 22B | Q4_K_M | 14.7 GB | CPU Offload | ~52 tok/s |
18 models are too large for this hardware.
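Models flagged "CPU Offload" above can still run by keeping only as many transformer layers on the GPU as fit in VRAM and running the rest from system RAM (e.g. via llama.cpp's `--n-gpu-layers` flag). A minimal sketch of how one might pick that layer count; the reserve figure is an assumption covering KV cache and CUDA context, not a measured value:

```python
def gpu_layers(model_gb: float, n_layers: int,
               vram_gb: float = 10.0, reserve_gb: float = 1.5) -> int:
    """Estimate how many of a model's layers fit in VRAM, leaving
    headroom (reserve_gb) for KV cache and runtime overhead.
    Layers that don't fit are offloaded to CPU RAM."""
    per_layer_gb = model_gb / n_layers
    budget_gb = max(0.0, vram_gb - reserve_gb)
    return max(0, min(n_layers, int(budget_gb / per_layer_gb)))

# A hypothetical 14B Q4_K_M model (~9.9 GB, 48 layers) on this 10 GB card:
print(gpu_layers(9.9, 48))  # 41 of 48 layers on GPU, the rest on CPU

# A 7B Q4_K_M model (~5 GB, 32 layers) fits entirely:
print(gpu_layers(5.0, 32))  # 32 (full GPU offload)
```

The more layers land on the CPU, the harder throughput drops, since system RAM bandwidth is an order of magnitude below the 3080's 760 GB/s; the ~63–77 tok/s offload estimates in the table assume only a small fraction of layers spill over.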