NVIDIA GeForce GTX 1080 Ti
NVIDIA · 11GB GDDR5X · Can run 36 models
| Spec | Value |
|---|---|
| Manufacturer | NVIDIA |
| VRAM | 11 GB |
| Memory Type | GDDR5X |
| Architecture | Pascal |
| CUDA Cores | 3,584 |
| Bandwidth | 484 GB/s |
| TDP | 250W |
| MSRP | $699 |
| Released | Mar 10, 2017 |
AI Notes
The GTX 1080 Ti is one of the most popular budget AI cards on the used market. Its 11GB of VRAM handles 7B models well and can run 13B-class models with aggressive quantization (Q4). While the Pascal architecture lacks Tensor Cores, its raw CUDA throughput is still respectable for inference. Available for under $250 used, it's one of the cheapest ways to get 11GB of VRAM.
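The "VRAM Used" figures below follow a common rule of thumb: weight size is roughly parameter count times bits-per-weight, plus some fixed overhead for context and the KV cache. A minimal sketch, where the bits-per-weight averages and the 1.5 GB overhead are assumptions for illustration, not exact GGUF numbers:

```python
# Rough VRAM estimate for a quantized model (a sketch, not exact accounting).
# Assumed average bits per weight for common GGUF quants:
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5}

def estimate_vram_gb(params_billion: float, quant: str,
                     overhead_gb: float = 1.5) -> float:
    """Approximate VRAM (GB) to fully load a model: weights + fixed overhead.

    overhead_gb is an assumed allowance for context / KV cache buffers.
    """
    weights_gb = params_billion * BITS_PER_WEIGHT[quant] / 8
    return weights_gb + overhead_gb

# Example: a 7B model at Q4_K_M lands near the table's 7B Q4 entries.
print(round(estimate_vram_gb(7, "Q4_K_M"), 1))  # ≈ 5.7 GB
```

Real usage varies with context length, architecture, and runtime, so treat this as a lower bound rather than a guarantee.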
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Qwen 3 0.6B | 600M | Q4_K_M | 2.5 GB | Runs | ~194 tok/s |
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~242 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~161 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~161 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~121 tok/s |
| Gemma 3n E2B | 2B | Q4_K_M | 3.3 GB | Runs | ~147 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~97 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs | ~83 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~108 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~97 tok/s |
| Gemma 3n E4B | 4B | Q4_K_M | 4.5 GB | Runs | ~108 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~108 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | Runs | ~54 tok/s |
| Falcon 3 7B | 7B | Q4_K_M | 6.8 GB | Runs | ~71 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | Runs | ~54 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | Runs | ~54 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | Runs | ~54 tok/s |
| Qwen 2.5 VL 7B | 7B | Q4_K_M | 7 GB | Runs | ~69 tok/s |
| Cogito 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~65 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~65 tok/s |
| Nemotron 3 Nano 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~65 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~65 tok/s |
| Falcon 3 10B | 10B | Q4_K_M | 8.5 GB | Runs | ~57 tok/s |
| Llama 3.2 Vision 11B | 11B | Q4_K_M | 8.5 GB | Runs | ~57 tok/s |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | Runs (tight) | ~48 tok/s |
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | Runs (tight) | ~51 tok/s |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | Runs (tight) | ~49 tok/s |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | Runs (tight) | ~49 tok/s |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | Runs (tight) | ~49 tok/s |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | CPU Offload | ~13 tok/s |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | CPU Offload | ~14 tok/s |
| Phi-4 Reasoning 14B | 14B | Q4_K_M | 11 GB | CPU Offload | ~13 tok/s |
| Qwen 2.5 Coder 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~12 tok/s |
| Qwen 3 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~12 tok/s |
| StarCoder2 15B | 15B | Q4_K_M | 10.5 GB | CPU Offload | ~14 tok/s |
| Codestral 22B | 22B | Q4_K_M | 14.7 GB | CPU Offload | ~10 tok/s |
33 models are too large for this hardware.
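The Fit column above reduces to a simple comparison against the card's 11 GB of VRAM. A minimal sketch of that classification, where the 85% "tight" threshold is an assumption chosen to match the table, not a documented cutoff:

```python
def fit_category(vram_needed_gb: float, gpu_vram_gb: float = 11.0) -> str:
    """Classify how a model fits on a GPU, mirroring the table's categories.

    Thresholds are assumptions: at or over capacity, layers spill to system
    RAM (CPU offload, large speed drop); above ~85% of VRAM there is little
    headroom left for context, so the model runs but tightly.
    """
    if vram_needed_gb >= gpu_vram_gb:
        return "CPU Offload"
    if vram_needed_gb > 0.85 * gpu_vram_gb:
        return "Runs (tight)"
    return "Runs"

print(fit_category(9.0))   # 9 GB model on an 11 GB card -> Runs
print(fit_category(9.9))   # -> Runs (tight)
print(fit_category(12.0))  # -> CPU Offload
```

The roughly 4x speed drop between the "Runs (tight)" and "CPU Offload" rows shows why staying fully in VRAM matters: once layers spill to system RAM, the PCIe bus becomes the bottleneck.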