Llama 3.1 405B

by Meta · llama-3 family

405B

parameters

Tags: text-generation, code-generation, reasoning, multilingual, tool-use, math, creative-writing, summarization

Llama 3.1 405B is the largest and most capable model in the Llama family, representing Meta's flagship open-source release. It competes directly with leading proprietary models on benchmarks spanning reasoning, coding, math, and multilingual understanding. Running this model locally requires enterprise-grade hardware with multiple high-end GPUs. However, for users with the necessary infrastructure, it provides state-of-the-art open-source performance without any API dependencies.

Quick Start with Ollama

ollama run llama3.1:405b-instruct-q4_K_M
Creator:         Meta
Parameters:      405B
Architecture:    transformer-decoder
Context Length:  128K tokens
License:         Llama 3.1 Community License
Released:        Jul 23, 2024
Ollama:          llama3.1:405b
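The 128K context window carries a memory cost beyond the weights: the KV cache grows linearly with sequence length. A rough sketch of that cost, assuming the commonly reported Llama 3.1 405B architecture (126 layers, 8 grouped-query KV heads, head dimension 128) and fp16 cache entries; these architecture figures are assumptions for illustration, not taken from this page:

```python
# Rough KV-cache size estimate for a long context window.
# Architecture numbers below are assumptions about Llama 3.1 405B
# (126 layers, 8 KV heads via GQA, head_dim 128), not from this page.

def kv_cache_bytes(tokens, layers=126, kv_heads=8, head_dim=128, bytes_per_elem=2):
    # 2x for keys and values, per layer, per KV head, per head dimension.
    return 2 * layers * kv_heads * head_dim * bytes_per_elem * tokens

full_context = kv_cache_bytes(128 * 1024)
print(f"KV cache at 128K tokens: {full_context / 1e9:.1f} GB")  # ~67.6 GB
```

Under these assumptions a full 128K-token context adds roughly 68 GB on top of the weights, which is one reason the VRAM figures below exceed the raw file sizes.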

Quantization Options

Format                 File Size  VRAM Required  Ollama Tag
Q4_K_M (recommended)   196 GB     244.5 GB       405b-instruct-q4_K_M
Q5_K_M                 228.8 GB   285 GB         405b-instruct-q5_K_M
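The "VRAM Required" figures run about 25% above the file sizes, which is consistent with a headroom allowance for the KV cache, activations, and runtime buffers. A minimal sketch of that rule of thumb; the 25% multiplier is an assumption inferred from the table, not a documented formula:

```python
# Rule-of-thumb VRAM estimate: model file size plus headroom for
# KV cache, activations, and runtime buffers. The 25% headroom
# factor is an assumption inferred from the table above.

def vram_estimate_gb(file_size_gb, headroom=0.25):
    return file_size_gb * (1 + headroom)

print(f"Q4_K_M: ~{vram_estimate_gb(196):.0f} GB")    # table lists 244.5 GB
print(f"Q5_K_M: ~{vram_estimate_gb(228.8):.0f} GB")  # table lists 285 GB
```

Actual requirements vary with context length and batch size, so treat any fixed multiplier as a starting estimate rather than a guarantee.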

Compatible Hardware for Q4_K_M

Showing compatibility for the recommended quantization (Q4_K_M, 244.5 GB VRAM).

Hardware                   VRAM    Type  Fit
Mac Pro M2 Ultra 192GB     192 GB  mac   CPU Offload
Mac Studio M4 Ultra 192GB  192 GB  mac   CPU Offload

34 other hardware devices cannot run this model configuration.
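The "CPU Offload" fit means the model exceeds available VRAM, so runtimes like llama.cpp (which Ollama builds on) keep only a subset of transformer layers on the GPU and run the rest from system RAM. A rough sketch of that split, assuming layers contribute equally to the file size and the 126-layer count commonly reported for Llama 3.1 405B; both are assumptions, not figures from this page:

```python
# Estimate how many transformer layers fit in VRAM when the model
# file is larger than GPU memory. Assumes equally sized layers and
# 126 total layers for Llama 3.1 405B (assumptions for illustration).

def gpu_layer_split(model_gb, vram_gb, total_layers=126, reserve_gb=8):
    per_layer_gb = model_gb / total_layers
    usable = max(vram_gb - reserve_gb, 0)      # keep headroom for buffers
    gpu_layers = min(int(usable / per_layer_gb), total_layers)
    return gpu_layers, total_layers - gpu_layers

on_gpu, offloaded = gpu_layer_split(model_gb=196, vram_gb=192)
print(f"{on_gpu} layers on GPU, {offloaded} offloaded to system RAM")
```

llama.cpp exposes this split as its GPU-layers option; Ollama normally picks a value automatically, and every offloaded layer costs throughput, so expect offloaded runs to be substantially slower than fully resident ones.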

Benchmark Scores

MMLU: 87.3