Llama 3.1 8B vs Llama 3.3 70B

Comparing VRAM requirements, performance, and capabilities for running these models locally with Ollama.

Llama 3.1 8B

Parameters: 8B
Context: 128K
VRAM Range: 6.3–18 GB
Recommended: Q8_0 (10 GB)
By Meta · License: Llama 3.1 Community License

Llama 3.3 70B

Parameters: 70B
Context: 128K
VRAM Range: 43.5–72 GB
Recommended: Q4_K_M (43.5 GB)
By Meta · License: Llama 3.3 Community License

VRAM Requirements by Quantization

Side-by-side memory needs at each quality level.

Quantization   Llama 3.1 8B   Llama 3.3 70B   Difference
Q4_K_M         6.3 GB         43.5 GB         -37.2 GB
Q8_0           10 GB          72 GB           -62.0 GB
F16            18 GB          -               -
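
The Difference column is the 8B figure minus the 70B figure. A minimal Python sketch of that arithmetic follows; the figures are copied from the table, the F16 value is assigned to the 8B model because 18 GB is the top of its VRAM range, and the names `VRAM_GB` and `difference_gb` are made up for illustration.

```python
# VRAM figures in GB, copied from the table: (Llama 3.1 8B, Llama 3.3 70B).
# None marks a value the table does not list.
VRAM_GB = {
    "Q4_K_M": (6.3, 43.5),
    "Q8_0": (10.0, 72.0),
    "F16": (18.0, None),
}


def difference_gb(small, large):
    """Return small - large rounded to 0.1 GB, or None if either value is missing."""
    if small is None or large is None:
        return None
    return round(small - large, 1)


if __name__ == "__main__":
    for quant, (small, large) in VRAM_GB.items():
        diff = difference_gb(small, large)
        shown = "-" if diff is None else f"{diff:+.1f} GB"
        print(f"{quant:8} {shown}")
```

A negative value means the 8B model needs that much less memory at the same quantization.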

Capabilities

Feature support comparison.

Capability Llama 3.1 8B Llama 3.3 70B
text generation Yes Yes
code generation Yes Yes
multilingual Yes Yes
tool use Yes Yes
summarization Yes Yes
reasoning Yes
math Yes
creative writing Yes

Benchmark Scores

Higher is better. Scores from published evaluations.

Benchmark   Llama 3.1 8B   Llama 3.3 70B
MMLU        73.0           86.0

Hardware Compatibility

Can each model run at its recommended quantization (Q8_0 for the 8B, Q4_K_M for the 70B) on common VRAM tiers?

VRAM    Llama 3.1 8B   Llama 3.3 70B
8 GB    Offload        No
12 GB   Runs           No
16 GB   Runs           No
24 GB   Runs           No
32 GB   Runs           Offload
48 GB   Runs           Tight
64 GB   Runs           Runs
96 GB   Runs           Runs
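
One way to read the table is as a comparison between available VRAM and the memory needed at the recommended quantization. The Python sketch below does that with two cut-offs, 0.7 and 1.15. Both are assumptions picked so the function reproduces the rows above; the page does not state the rule it actually uses, and `classify` is a hypothetical helper.

```python
# Assumed cut-offs, chosen only to match the table; not a published rule.
OFFLOAD_MIN_RATIO = 0.7   # below this fraction of the requirement: "No"
RUNS_MIN_RATIO = 1.15     # at or above this multiple of the requirement: "Runs"


def classify(vram_gb, required_gb):
    """Label a VRAM tier for a model that needs required_gb at its recommended quantization."""
    ratio = vram_gb / required_gb
    if ratio >= RUNS_MIN_RATIO:
        return "Runs"      # fits with headroom for context and overhead
    if ratio >= 1.0:
        return "Tight"     # weights fit, little room to spare
    if ratio >= OFFLOAD_MIN_RATIO:
        return "Offload"   # part of the model has to live in system RAM
    return "No"


if __name__ == "__main__":
    for vram in (8, 12, 16, 24, 32, 48, 64, 96):
        # 10 GB = Llama 3.1 8B at Q8_0, 43.5 GB = Llama 3.3 70B at Q4_K_M
        print(f"{vram:>2} GB  {classify(vram, 10.0):8} {classify(vram, 43.5)}")
```

The real requirement also depends on context length and runtime overhead, so treat the labels as a rough guide.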

Run Llama 3.1 8B

ollama run llama3.1:8b-instruct-q8_0

Run Llama 3.3 70B

ollama run llama3.3:70b-instruct-q4_K_M
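
Once a model has been pulled, it can also be queried from a script through Ollama's local HTTP API. The Python sketch below assumes the server is listening on its default address, `http://localhost:11434`, and that the model tag is `llama3.1:8b-instruct-q8_0`; check the tag against `ollama list` on your machine. The helper names are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt):
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt):
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Assumed tag; swap in the 70B tag if your hardware can hold it.
    print(generate("llama3.1:8b-instruct-q8_0", "Say hello in one sentence."))
```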
