
Llama 3.1 70B vs Qwen 2.5 72B

Comparing VRAM requirements, performance, and capabilities for running these models locally with Ollama.

Llama 3.1 70B
Parameters: 70B
Context: 128K
VRAM Range: 43.5–72 GB
Recommended: Q4_K_M (43.5 GB)
By Meta · License: Llama 3.1 Community License
Qwen 2.5 72B
Parameters: 72B
Context: 128K
VRAM Range: 44.7–74 GB
Recommended: Q4_K_M (44.7 GB)
By Alibaba · License: Qwen License

VRAM Requirements by Quantization

Side-by-side memory needs at each quality level.

Quantization Llama 3.1 70B Qwen 2.5 72B Difference (Llama − Qwen)
Q4_K_M 43.5 GB 44.7 GB −1.2 GB
Q8_0 72 GB 74 GB −2.0 GB
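As a rough rule of thumb, a quantized model's weight footprint is parameter count × bits-per-weight ÷ 8, plus some headroom for activations and the KV cache. The sketch below uses assumed average bits-per-weight figures for llama.cpp-style quantizations and a flat overhead allowance, so its estimates land near (but not exactly on) the measured figures in the table above.

```python
# Rough VRAM estimate: weights (params * bits / 8) plus a flat overhead
# allowance for activations and KV cache. Bits-per-weight values are
# approximate averages for llama.cpp-style quantizations (assumptions).

BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,  # assumed average, mixed 4/6-bit blocks
    "Q8_0": 8.5,     # assumed average, 8-bit weights plus scales
}

def estimate_vram_gb(params_billions: float, quant: str,
                     overhead_gb: float = 2.0) -> float:
    """Estimate total VRAM in GB for a quantized model."""
    weights_gb = params_billions * BITS_PER_WEIGHT[quant] / 8
    return round(weights_gb + overhead_gb, 1)

for model, params in [("Llama 3.1 70B", 70), ("Qwen 2.5 72B", 72)]:
    for quant in ("Q4_K_M", "Q8_0"):
        print(f"{model} {quant}: ~{estimate_vram_gb(params, quant)} GB")
```

Actual footprints also depend on context length (KV cache grows with it) and runtime buffers, which is why published numbers vary by a few GB.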

Capabilities

Feature support comparison.

Capability Llama 3.1 70B Qwen 2.5 72B
text generation Yes Yes
code generation Yes Yes
reasoning Yes Yes
multilingual Yes Yes
tool use Yes Yes
math Yes Yes
creative writing Yes Yes
summarization Yes Yes

Benchmark Scores

Higher is better. Scores from published evaluations.

Benchmark Llama 3.1 70B Qwen 2.5 72B
mmlu 83.6 85.3

Hardware Compatibility

Can each model run at its recommended Q4_K_M quantization on common VRAM tiers? Runs = fits with comfortable headroom; Tight = fits with little room for long context; Offload = requires partial CPU offload; No = does not fit.

VRAM Llama 3.1 70B Qwen 2.5 72B
8 GB No No
12 GB No No
16 GB No No
24 GB No No
32 GB Offload Offload
48 GB Tight Tight
64 GB Runs Runs
96 GB Runs Runs
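The tiers above follow from the ratio of available VRAM to model size. The classifier below reproduces the table using illustrative thresholds (my assumptions, not Ollama's): ~20% headroom over the model footprint for "Runs", a bare fit for "Tight", and roughly 70% of the footprint as the floor for useful partial offload.

```python
# Illustrative VRAM-tier classifier. Thresholds are assumptions chosen to
# match the table above, not values from any official tool.

def fit_tier(model_gb: float, vram_gb: float) -> str:
    ratio = vram_gb / model_gb
    if ratio >= 1.2:   # ~20% headroom for KV cache and activations
        return "Runs"
    if ratio >= 1.0:   # weights fit, but little room for context
        return "Tight"
    if ratio >= 0.7:   # enough to keep most layers on the GPU
        return "Offload"
    return "No"

# Llama 3.1 70B at Q4_K_M is ~43.5 GB:
for vram in (8, 12, 16, 24, 32, 48, 64, 96):
    print(f"{vram} GB: {fit_tier(43.5, vram)}")
```

The same thresholds reproduce the Qwen 2.5 72B column with its 44.7 GB footprint.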

Run Llama 3.1 70B

ollama run llama3.1:70b-instruct-q4_K_M

Run Qwen 2.5 72B

ollama run qwen2.5:72b-instruct-q4_K_M
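Beyond the interactive CLI, a running Ollama server exposes a local REST API on port 11434. The sketch below queries the `/api/generate` endpoint with the Python standard library; it assumes the server is running and the model tag has already been pulled.

```python
import json
import urllib.request

# Query a locally running Ollama server via its /api/generate endpoint.
# Assumes the default host/port and an already-pulled model tag.

def build_payload(model: str, prompt: str) -> dict:
    # stream=False returns one JSON object instead of a line-delimited stream
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str,
             host: str = "http://localhost:11434") -> str:
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires the Ollama server to be running):
# print(generate("llama3.1:70b-instruct-q4_K_M",
#                "Summarize the KV cache in one sentence."))
```

Swapping in `qwen2.5:72b-instruct-q4_K_M` queries the other model with no other changes.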

Check your exact hardware

Use the compatibility checker to see how each model performs on your specific GPU or Mac.