NVIDIA GeForce RTX 2070 Super
NVIDIA · 8GB GDDR6 · Can run 35 models
Buy Amazon
| Manufacturer | NVIDIA |
| VRAM | 8 GB |
| Memory Type | GDDR6 |
| Architecture | Turing |
| CUDA Cores | 2,560 |
| Tensor Cores | 320 |
| Bandwidth | 448 GB/s |
| TDP | 215W |
| MSRP | $499 |
| Released | Jul 9, 2019 |
AI Notes
The RTX 2070 Super offers 8GB VRAM and solid CUDA core count for its generation. It handles 7B models well with Q4 quantization and is widely available on the used market. A step up from the 2060 Super with more compute power at the same VRAM capacity.
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Qwen 3 0.6B | 600M | Q4_K_M | 2.5 GB | Runs | ~179 tok/s |
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~224 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~149 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~149 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~112 tok/s |
| Gemma 3n E2B | 2B | Q4_K_M | 3.3 GB | Runs | ~136 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~90 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs | ~77 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~100 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~90 tok/s |
| Gemma 3n E4B | 4B | Q4_K_M | 4.5 GB | Runs | ~100 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~100 tok/s |
| Falcon 3 7B | 7B | Q4_K_M | 6.8 GB | Runs | ~66 tok/s |
| Qwen 2.5 VL 7B | 7B | Q4_K_M | 7 GB | Runs (tight) | ~64 tok/s |
| Cogito 8B | 8B | Q4_K_M | 7.5 GB | Runs (tight) | ~60 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | Runs (tight) | ~60 tok/s |
| Nemotron 3 Nano 8B | 8B | Q4_K_M | 7.5 GB | Runs (tight) | ~60 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | Runs (tight) | ~60 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~15 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~15 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~15 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~15 tok/s |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | CPU Offload | ~14 tok/s |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | CPU Offload | ~12 tok/s |
| Falcon 3 10B | 10B | Q4_K_M | 8.5 GB | CPU Offload | ~16 tok/s |
| Llama 3.2 Vision 11B | 11B | Q4_K_M | 8.5 GB | CPU Offload | ~16 tok/s |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | CPU Offload | ~13 tok/s |
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | CPU Offload | ~14 tok/s |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~14 tok/s |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~14 tok/s |
| Phi-4 Reasoning 14B | 14B | Q4_K_M | 11 GB | CPU Offload | ~12 tok/s |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~14 tok/s |
| Qwen 2.5 Coder 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~11 tok/s |
| Qwen 3 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~11 tok/s |
| StarCoder2 15B | 15B | Q4_K_M | 10.5 GB | CPU Offload | ~13 tok/s |
34
model(s) are too large for this hardware.