
NVIDIA GeForce GTX 1080 Ti

NVIDIA · 11GB GDDR5X · Can run 36 models

Manufacturer: NVIDIA
VRAM: 11 GB
Memory Type: GDDR5X
Architecture: Pascal
CUDA Cores: 3,584
Bandwidth: 484 GB/s
TDP: 250W
MSRP: $699
Released: Mar 10, 2017

AI Notes

The GTX 1080 Ti is one of the most popular budget AI cards on the used market. Its 11GB of VRAM handles 7B models comfortably and can run 13B models with aggressive quantization (Q4). While it lacks tensor cores, its raw CUDA throughput is still respectable for inference. Available used for under $250, it's one of the cheapest ways to get 11GB of VRAM.
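The VRAM figures in the compatibility table follow a simple pattern: weight bytes at the quant format's effective bits per weight, plus an allowance for KV cache and runtime buffers. A rough estimator sketch (the bits-per-weight values and the flat 1.2 GB allowance are illustrative assumptions, not figures published on this page):

```python
# Back-of-the-envelope VRAM estimate for a quantized model.
# ASSUMPTIONS: effective bits per weight for each quant type, and a flat
# 1.2 GB allowance for KV cache and runtime buffers. Real usage varies
# with context length and runtime.

BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5, "F16": 16.0}

def estimate_vram_gb(params_billions, quant, overhead_gb=1.2):
    # 1B parameters at 8 bits/weight is roughly 1 GB of weights.
    weights_gb = params_billions * BITS_PER_WEIGHT[quant] / 8
    return round(weights_gb + overhead_gb, 1)

print(estimate_vram_gb(7, "Q8_0"))     # close to the ~9 GB listed for 7B Q8_0 models
print(estimate_vram_gb(13, "Q4_K_M"))  # a 13B at Q4 lands under the card's 11 GB
```

This is why 13B models only fit with Q4-class quantization: at Q8_0 the weights alone would exceed the card's 11 GB.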

Compatible Models

| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Qwen 3 0.6B | 600M | Q4_K_M | 2.5 GB | Runs | ~194 tok/s |
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~242 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~161 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~161 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~121 tok/s |
| Gemma 3n E2B | 2B | Q4_K_M | 3.3 GB | Runs | ~147 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~97 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs | ~83 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~108 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~97 tok/s |
| Gemma 3n E4B | 4B | Q4_K_M | 4.5 GB | Runs | ~108 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~108 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | Runs | ~54 tok/s |
| Falcon 3 7B | 7B | Q4_K_M | 6.8 GB | Runs | ~71 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | Runs | ~54 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | Runs | ~54 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | Runs | ~54 tok/s |
| Qwen 2.5 VL 7B | 7B | Q4_K_M | 7 GB | Runs | ~69 tok/s |
| Cogito 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~65 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~65 tok/s |
| Nemotron 3 Nano 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~65 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | Runs | ~65 tok/s |
| Falcon 3 10B | 10B | Q4_K_M | 8.5 GB | Runs | ~57 tok/s |
| Llama 3.2 Vision 11B | 11B | Q4_K_M | 8.5 GB | Runs | ~57 tok/s |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | Runs (tight) | ~48 tok/s |
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | Runs (tight) | ~51 tok/s |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | Runs (tight) | ~49 tok/s |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | Runs (tight) | ~49 tok/s |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | Runs (tight) | ~49 tok/s |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | CPU Offload | ~13 tok/s |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | CPU Offload | ~14 tok/s |
| Phi-4 Reasoning 14B | 14B | Q4_K_M | 11 GB | CPU Offload | ~13 tok/s |
| Qwen 2.5 Coder 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~12 tok/s |
| Qwen 3 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~12 tok/s |
| StarCoder2 15B | 15B | Q4_K_M | 10.5 GB | CPU Offload | ~14 tok/s |
| Codestral 22B | 22B | Q4_K_M | 14.7 GB | CPU Offload | ~10 tok/s |
33 models are too large for this hardware.