NVIDIA GeForce GTX 1070
NVIDIA · 8GB GDDR5 · Can run 35 models
| Spec | Value |
|---|---|
| Manufacturer | NVIDIA |
| VRAM | 8 GB |
| Memory Type | GDDR5 |
| Architecture | Pascal |
| CUDA Cores | 1,920 |
| Bandwidth | 256 GB/s |
| TDP | 150W |
| MSRP | $379 |
| Released | Jun 10, 2016 |
AI Notes
The GTX 1070 offers 8GB of GDDR5 VRAM, enough to load 7B models with Q4 quantization. It lacks tensor cores and has lower memory bandwidth than modern cards, so inference speeds will be modest. Available very cheaply on the used market, it's a reasonable starting point for trying local AI on older hardware.
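As a rough sketch of where VRAM figures like these come from: a quantized model's memory footprint is roughly parameters times bytes per weight, plus an allowance for the KV cache and runtime buffers. The bytes-per-weight values and the overhead constant below are illustrative assumptions, not measured numbers.

```python
# Rough VRAM estimate: weights + fixed allowance for KV cache and
# runtime buffers. Bytes-per-weight values and the 1.2 GB overhead
# are illustrative assumptions, not benchmarks.
BYTES_PER_WEIGHT = {"Q4_K_M": 0.57, "Q8_0": 1.06, "F16": 2.0}

def estimate_vram_gb(params_billions: float, quant: str, overhead_gb: float = 1.2) -> float:
    """Approximate VRAM needed to load a model fully on the GPU."""
    return params_billions * BYTES_PER_WEIGHT[quant] + overhead_gb

def fits(params_billions: float, quant: str, card_vram_gb: float = 8.0) -> bool:
    return estimate_vram_gb(params_billions, quant) <= card_vram_gb

# A 7B model at Q4_K_M: ~7 * 0.57 + 1.2 ≈ 5.2 GB, comfortably inside 8 GB.
# The same model at Q8_0: ~7 * 1.06 + 1.2 ≈ 8.6 GB, which spills past 8 GB.
```

This is why the table below shows 7B-8B models running at Q4_K_M but falling back to CPU offload at Q8_0.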
Compatible Models
| Model | Parameters | Best Quant | VRAM Used | Fit | Est. Speed |
|---|---|---|---|---|---|
| Qwen 3 0.6B | 600M | Q4_K_M | 2.5 GB | Runs | ~102 tok/s |
| Gemma 3 1B | 1B | Q8_0 | 2 GB | Runs | ~128 tok/s |
| Llama 3.2 1B | 1B | Q8_0 | 3 GB | Runs | ~85 tok/s |
| DeepSeek R1 1.5B | 1.5B | Q8_0 | 3 GB | Runs | ~85 tok/s |
| Gemma 2 2B | 2B | Q8_0 | 4 GB | Runs | ~64 tok/s |
| Gemma 3n E2B | 2B | Q4_K_M | 3.3 GB | Runs | ~78 tok/s |
| Llama 3.2 3B | 3B | Q8_0 | 5 GB | Runs | ~51 tok/s |
| Phi-3 Mini 3.8B | 3.8B | Q8_0 | 5.8 GB | Runs | ~44 tok/s |
| Phi-4 Mini 3.8B | 3.8B | Q4_K_M | 4.5 GB | Runs | ~57 tok/s |
| Gemma 3 4B | 4B | Q4_K_M | 5 GB | Runs | ~51 tok/s |
| Gemma 3n E4B | 4B | Q4_K_M | 4.5 GB | Runs | ~57 tok/s |
| Qwen 3 4B | 4B | Q4_K_M | 4.5 GB | Runs | ~57 tok/s |
| Falcon 3 7B | 7B | Q4_K_M | 6.8 GB | Runs | ~38 tok/s |
| Qwen 2.5 VL 7B | 7B | Q4_K_M | 7 GB | Runs (tight) | ~37 tok/s |
| Cogito 8B | 8B | Q4_K_M | 7.5 GB | Runs (tight) | ~34 tok/s |
| DeepSeek R1 8B | 8B | Q4_K_M | 7.5 GB | Runs (tight) | ~34 tok/s |
| Nemotron 3 Nano 8B | 8B | Q4_K_M | 7.5 GB | Runs (tight) | ~34 tok/s |
| Qwen 3 8B | 8B | Q4_K_M | 7.5 GB | Runs (tight) | ~34 tok/s |
| DeepSeek R1 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~8 tok/s |
| Mistral 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~8 tok/s |
| Qwen 2.5 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~8 tok/s |
| Qwen 2.5 Coder 7B | 7B | Q8_0 | 9 GB | CPU Offload | ~8 tok/s |
| Llama 3.1 8B | 8B | Q8_0 | 10 GB | CPU Offload | ~8 tok/s |
| Gemma 2 9B | 9B | Q8_0 | 11 GB | CPU Offload | ~7 tok/s |
| Falcon 3 10B | 10B | Q4_K_M | 8.5 GB | CPU Offload | ~9 tok/s |
| Llama 3.2 Vision 11B | 11B | Q4_K_M | 8.5 GB | CPU Offload | ~9 tok/s |
| Gemma 3 12B | 12B | Q4_K_M | 10.5 GB | CPU Offload | ~7 tok/s |
| Mistral Nemo 12B | 12B | Q4_K_M | 9.5 GB | CPU Offload | ~8 tok/s |
| DeepSeek R1 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~8 tok/s |
| Phi-4 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~8 tok/s |
| Phi-4 Reasoning 14B | 14B | Q4_K_M | 11 GB | CPU Offload | ~7 tok/s |
| Qwen 2.5 14B | 14B | Q4_K_M | 9.9 GB | CPU Offload | ~8 tok/s |
| Qwen 2.5 Coder 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~6 tok/s |
| Qwen 3 14B | 14B | Q4_K_M | 12 GB | CPU Offload | ~6 tok/s |
| StarCoder2 15B | 15B | Q4_K_M | 10.5 GB | CPU Offload | ~7 tok/s |
34 models are too large for this hardware.
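The Fit column above follows a simple threshold rule, which can be sketched as below. The ~1 GB "tight" margin is inferred from the table's entries, not an official specification.

```python
def classify_fit(vram_used_gb: float, card_vram_gb: float = 8.0,
                 tight_margin_gb: float = 1.0) -> str:
    """Classify a model's fit on a card, mirroring the table's Fit column.

    Thresholds are inferred from the table: models within ~1 GB of the
    card's capacity leave little room for context and are flagged
    "Runs (tight)"; anything larger must offload layers to system RAM.
    """
    if vram_used_gb > card_vram_gb:
        return "CPU Offload"
    if vram_used_gb >= card_vram_gb - tight_margin_gb:
        return "Runs (tight)"
    return "Runs"
```

For example, a model needing 5 GB classifies as "Runs", 7.5 GB as "Runs (tight)", and 9 GB as "CPU Offload", matching the rows above.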