NVIDIA GeForce GTX 1660 Super
NVIDIA · 6GB GDDR6 · Can run 26 models
| Spec | Value |
|---|---|
| Manufacturer | NVIDIA |
| VRAM | 6 GB |
| Memory Type | GDDR6 |
| Architecture | Turing |
| CUDA Cores | 1,408 |
| Bandwidth | 336 GB/s |
| TDP | 125 W |
| MSRP | $229 |
| Released | Oct 29, 2019 |
AI Notes
The GTX 1660 Super's 6GB VRAM comfortably fits models up to roughly 4B parameters; 7B models need tight quantization (Q4 or lower) and still spill into CPU offload. It lacks Tensor Cores, so inference speed relies entirely on CUDA-core throughput and memory bandwidth. Widely available used at low prices, it's a minimal entry point for experimenting with local AI on a tight budget.
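As a rough sizing heuristic (not an exact rule; actual usage depends on context length, KV cache, and runtime overhead), the weights of a model take about `parameters × bits-per-weight / 8` bytes, where Q4_K_M averages around 4.5 bits per weight and Q8_0 around 8.5. A sketch, with the overhead figure being an assumed placeholder:

```python
def estimate_vram_gb(params_billions, bits_per_weight, overhead_gb=1.0):
    """Rough VRAM estimate: weight bytes plus a flat overhead term.

    overhead_gb is an assumed allowance for KV cache and activations;
    real usage varies with context length and runtime.
    """
    weight_gb = params_billions * bits_per_weight / 8
    return weight_gb + overhead_gb

# A 7B model at Q4_K_M (~4.5 bits/weight) needs roughly 5 GB before
# context overhead, which is why 7B is borderline on a 6 GB card.
print(f"7B @ Q4_K_M: ~{estimate_vram_gb(7, 4.5):.1f} GB")
print(f"4B @ Q4_K_M: ~{estimate_vram_gb(4, 4.5):.1f} GB")
```

The estimates run slightly below the table's "VRAM Used" column, which appears to include larger context buffers.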
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Qwen 3 0.6B | 600M | Q4_K_M | 2.5 GB | Runs | ~134 tok/s |
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~168 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~112 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~112 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~84 tok/s |
| Gemma 3n E2B | 2B | Q4_K_M | 3.3 GB | Runs | ~102 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~67 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~75 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~67 tok/s |
| Gemma 3n E4B | 4B | Q4_K_M | 4.5 GB | Runs | ~75 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~75 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | CPU Offload | ~17 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~11 tok/s |
| Falcon 3 7B | 7B | Q4_K_M | 6.8 GB | CPU Offload | ~15 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~11 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~11 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~11 tok/s |
| Qwen 2.5 VL 7B | 7B | Q4_K_M | 7 GB | CPU Offload | ~14 tok/s |
| Cogito 8B | 8B | Q4_K_M | 7.5 GB | CPU Offload | ~14 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | CPU Offload | ~14 tok/s |
| Llama 3.1 8B | 8B | Q4_K_M | 6.3 GB | CPU Offload | ~16 tok/s |
| Nemotron 3 Nano 8B | 8B | Q4_K_M | 7.5 GB | CPU Offload | ~14 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | CPU Offload | ~14 tok/s |
| Gemma 2 9B | 9B | Q4_K_M | 6.9 GB | CPU Offload | ~15 tok/s |
| Falcon 3 10B | 10B | Q4_K_M | 8.5 GB | CPU Offload | ~12 tok/s |
| Llama 3.2 Vision 11B | 11B | Q4_K_M | 8.5 GB | CPU Offload | ~12 tok/s |
43 models are too large for this hardware.
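For models that fit entirely in VRAM, the speed estimates above follow a simple memory-bandwidth-bound rule: each generated token reads every weight once, so tokens per second is roughly the card's bandwidth divided by the model's resident size. A minimal sketch using the specs from this page:

```python
BANDWIDTH_GBPS = 336  # GTX 1660 Super memory bandwidth (GB/s)

def est_tok_per_s(model_size_gb, bandwidth_gbps=BANDWIDTH_GBPS):
    # Bandwidth-bound decode: one full pass over the weights per token.
    return bandwidth_gbps / model_size_gb

print(round(est_tok_per_s(2.5)))  # Qwen 3 0.6B @ 2.5 GB -> ~134 tok/s
print(round(est_tok_per_s(5.0)))  # Llama 3.2 3B  @ 5 GB  -> ~67 tok/s
```

The CPU-offload rows fall well below this bound because layers spilled to system RAM are read over the much slower PCIe/DRAM path.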