# Llama 4 Maverick

by Meta · llama-4 family · 400B parameters

Tags: text-generation, code-generation, reasoning, multilingual, vision, math, tool-use, creative-writing, summarization
Llama 4 Maverick is Meta's flagship mixture-of-experts model with 400B total parameters (17B active per token) across 128 experts. It features native multimodal support for text and image inputs, a 1M token context window, and strong performance across reasoning, coding, and multilingual tasks. Maverick delivers competitive results against top proprietary models while being open-weight. Its massive expert count enables broad knowledge coverage, though the full 400B parameter count means all expert weights must be loaded into memory for inference.
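As a back-of-envelope illustration of the compute/memory split described above (the parameter counts are from this page; treating per-token compute as proportional to active parameters is a common rule of thumb, not a published figure):

```python
TOTAL_PARAMS_B = 400.0   # all expert weights; this is what must sit in memory
ACTIVE_PARAMS_B = 17.0   # parameters routed through per token; this drives compute

def active_fraction(active_b: float = ACTIVE_PARAMS_B,
                    total_b: float = TOTAL_PARAMS_B) -> float:
    """Share of the weights that actually execute for each token."""
    return active_b / total_b

print(f"{active_fraction():.2%} of weights are active per token")  # → 4.25%
```

This is the MoE trade-off in one number: memory footprint scales with the total 400B, while per-token compute scales with the 17B that the router selects.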
## Quick Start with Ollama

```
ollama run maverick-q4_K_M
```

| | |
|---|---|
| Creator | Meta |
| Parameters | 400B |
| Architecture | mixture-of-experts |
| Context | 1024K tokens |
| Released | Apr 5, 2025 |
| License | Llama 4 Community License |
| Ollama | llama4:maverick |
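Once the model is pulled, it can also be called programmatically through Ollama's local REST API (default port 11434). A minimal sketch, using only the standard library; the prompt and `num_ctx` value below are placeholder examples:

```python
import json
import urllib.request

def build_generate_request(prompt: str,
                           model: str = "llama4:maverick",
                           num_ctx: int = 8192) -> dict:
    """Payload for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,                  # one JSON reply instead of a stream
        "options": {"num_ctx": num_ctx},  # context length actually allocated
    }

def generate(prompt: str, host: str = "http://localhost:11434") -> str:
    """POST the request to a locally running Ollama server."""
    payload = build_generate_request(prompt)
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Requires a running Ollama server with the model pulled:
#   text = generate("Summarize mixture-of-experts routing in two sentences.")
```

Note that `num_ctx` defaults to far less than the model's 1M-token maximum; raising it increases KV-cache memory use accordingly.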
## Quantization Options

| Format | File Size | VRAM Required | Ollama Tag |
|---|---|---|---|
| Q4_K_M (recommended) | 220 GB | 228 GB | maverick-q4_K_M |
| Q8_0 | 400 GB | 410 GB | maverick-q8_0 |
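The file sizes above line up with a simple bytes-per-weight estimate. The effective bits-per-weight figures below (≈4.5 for Q4_K_M, ≈8.0 for Q8_0) are approximations, since GGUF quantization formats store scale factors alongside the quantized values:

```python
def est_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GB for a quantized checkpoint."""
    return params_billions * bits_per_weight / 8  # bits → bytes

print(est_size_gb(400, 4.5))  # 225.0 — close to the 220 GB listed for Q4_K_M
print(est_size_gb(400, 8.0))  # 400.0 — matching the Q8_0 row
```

The same formula explains why VRAM required slightly exceeds file size: the weights are loaded as-is, plus a few GB of runtime buffers.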
## Compatible Hardware

Q4_K_M requires 228 GB of VRAM.
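To gauge what 228 GB means in practice, a rough sketch of the accelerator count needed. The 80 GB per GPU and 10% overhead margin are assumptions (an A100/H100-class card; real deployments also need headroom for KV cache and activations):

```python
import math

def gpus_needed(model_vram_gb: float,
                vram_per_gpu_gb: float = 80.0,
                overhead_frac: float = 0.10) -> int:
    """Minimum GPU count to hold the weights with a fixed overhead margin."""
    usable = vram_per_gpu_gb * (1 - overhead_frac)
    return math.ceil(model_vram_gb / usable)

print(gpus_needed(228))  # → 4, for the Q4_K_M row
print(gpus_needed(410))  # → 6, for the Q8_0 row
```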
## Benchmark Scores

| Benchmark | Score |
|---|---|
| MMLU | 82.0 |