NVIDIA GeForce RTX 4070 Ti

NVIDIA · 12GB GDDR6X · Can run 16 models

Manufacturer   NVIDIA
VRAM           12 GB
Memory Type    GDDR6X
Architecture   Ada Lovelace
CUDA Cores     7,680
Tensor Cores   240
TDP            285 W
MSRP           $799
Released       Jan 5, 2023

AI Notes

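The RTX 4070 Ti offers solid AI inference capability with 12 GB of GDDR6X VRAM. It runs 7B to 9B-parameter models at 8-bit quantization (Q8_0) and 14B models at 4-bit quantization (Q4_K_M) entirely in VRAM. Full-precision (FP16) weights for a 7B model alone take about 14 GB (7 billion parameters × 2 bytes), so they do not fit. Larger models, such as Codestral 22B or Gemma 2 27B, exceed 12 GB even at Q4_K_M and require more aggressive quantization or offloading part of the model to system RAM.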

Compatible Models

Model              Parameters  Best Quant  VRAM Used  Fit
Llama 3.2 1B       1B          Q8_0        3 GB       Runs
Gemma 2 2B         2B          Q8_0        4 GB       Runs
Llama 3.2 3B       3B          Q8_0        5 GB       Runs
Phi-3 Mini 3.8B    3.8B        Q8_0        5.8 GB     Runs
DeepSeek R1 7B     7B          Q8_0        9 GB       Runs
Mistral 7B         7B          Q8_0        9 GB       Runs
Qwen 2.5 7B        7B          Q8_0        9 GB       Runs
Qwen 2.5 Coder 7B  7B          Q8_0        9 GB       Runs
Llama 3.1 8B       8B          Q8_0        10 GB      Runs
DeepSeek R1 14B    14B         Q4_K_M      9.9 GB     Runs
Phi-4 14B          14B         Q4_K_M      9.9 GB     Runs
Qwen 2.5 14B       14B         Q4_K_M      9.9 GB     Runs
Gemma 2 9B         9B          Q8_0        11 GB      Runs (tight)
StarCoder2 15B     15B         Q8_0        17 GB      CPU Offload
Codestral 22B      22B         Q4_K_M      14.7 GB    CPU Offload
Gemma 2 27B        27B         Q4_K_M      17.7 GB    CPU Offload

9 models are too large for this hardware.
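
The VRAM Used figures in the table are consistent with a simple linear rule: roughly 1.0 GB per billion parameters plus 2 GB of overhead at Q8_0, and roughly 0.6 GB per billion parameters plus 1.5 GB at Q4_K_M. The sketch below encodes that rule in Python. The coefficients are a fit to the rows listed here, not a formula published by this page, and the function names, the 90% "tight" threshold, and the three fit labels are assumptions made for illustration. The cutoff at which a model counts as too large for the hardware is not stated on the page, so it is not modeled.

```python
# Hypothetical linear fit to the table rows on this page; not an official formula.
GB_PER_BILLION_PARAMS = {"Q8_0": 1.0, "Q4_K_M": 0.6}
OVERHEAD_GB = {"Q8_0": 2.0, "Q4_K_M": 1.5}


def estimate_vram_gb(params_billion: float, quant: str) -> float:
    """Estimate VRAM use in GB for a model size (in billions) and quant format."""
    weights_gb = params_billion * GB_PER_BILLION_PARAMS[quant]
    return round(weights_gb + OVERHEAD_GB[quant], 1)


def classify_fit(used_gb: float, vram_gb: float = 12.0,
                 tight_fraction: float = 0.9) -> str:
    """Label a model by how its estimated VRAM use compares to the card's VRAM.

    tight_fraction is an assumed threshold chosen so that 11 GB on a
    12 GB card is reported as tight, matching the Gemma 2 9B row.
    """
    if used_gb > vram_gb:
        return "CPU Offload"
    if used_gb > tight_fraction * vram_gb:
        return "Runs (tight)"
    return "Runs"


if __name__ == "__main__":
    for name, params, quant in [
        ("Llama 3.1 8B", 8, "Q8_0"),
        ("Phi-4 14B", 14, "Q4_K_M"),
        ("Gemma 2 9B", 9, "Q8_0"),
        ("Codestral 22B", 22, "Q4_K_M"),
    ]:
        used = estimate_vram_gb(params, quant)
        print(f"{name}: {used} GB, {classify_fit(used)}")
```

Because the rule is only a fit, treat its output as a rough guide. Actual usage also depends on context length and runtime, which the table does not break out.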