Llama 4 Maverick

by Meta · llama-4 family

400B parameters

text-generation · code-generation · reasoning · multilingual · vision · math · tool-use · creative-writing · summarization

Llama 4 Maverick is Meta's flagship mixture-of-experts model with 400B total parameters (17B active per token) across 128 experts. It features native multimodal support for text and image inputs, a 1M token context window, and strong performance across reasoning, coding, and multilingual tasks. Maverick delivers competitive results against top proprietary models while being open-weight. Its massive expert count enables broad knowledge coverage, though the full 400B parameter count means all expert weights must be loaded into memory for inference.
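The sparse-activation trade-off described above can be sketched numerically: per-token compute scales with the 17B active parameters, while memory scales with the full 400B, since every expert must stay resident.

```python
TOTAL_PARAMS = 400e9   # all 128 experts must be loaded for inference
ACTIVE_PARAMS = 17e9   # parameters actually routed per token

# Per-token FLOPs track the active count; VRAM tracks the total.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"active fraction per token: {active_fraction:.2%}")  # 4.25%
```

This is why Maverick is fast per token relative to a dense 400B model, yet still needs dense-model-class memory.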

Quick Start with Ollama

ollama run maverick-q4_K_M
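Once the model is pulled, it can also be called programmatically. A minimal sketch using only the standard library, assuming a local Ollama server on its default port (11434) and the `maverick-q4_K_M` tag from this page:

```python
import json
import urllib.request

# Default local Ollama endpoint (assumption: server is running on this machine).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "maverick-q4_K_M") -> dict:
    # Non-streaming, one-shot generation request body for Ollama's REST API.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str) -> str:
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example usage (requires the running server and a pulled model):
#   print(generate("Explain mixture-of-experts in one sentence."))
```

Setting `"stream": False` returns the whole completion in a single JSON response instead of newline-delimited chunks.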
Resources: Ollama · Hugging Face · Official Page

Creator: Meta
Parameters: 400B
Architecture: mixture-of-experts
Context: 1024K tokens
Released: Apr 5, 2025
License: Llama 4 Community License
Ollama: llama4:maverick

Quantization Options

| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M (recommended) | 220 GB | 228 GB | — | maverick-q4_K_M |
| Q8_0 | 400 GB | 410 GB | — | maverick-q8_0 |
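The file sizes above follow from simple bits-per-weight arithmetic. A rough sketch, assuming ~4.4 bits/weight for Q4_K_M and 8 bits/weight for Q8_0 (approximate averages; actual GGUF sizes vary by tensor layout):

```python
def gguf_size_gb(total_params: float, bits_per_weight: float) -> float:
    # Every expert's weights are stored, so the full 400B count applies
    # even though only 17B parameters are active per token.
    return total_params * bits_per_weight / 8 / 1e9

print(round(gguf_size_gb(400e9, 4.4)))  # ~220 GB (Q4_K_M)
print(round(gguf_size_gb(400e9, 8.0)))  # 400 GB (Q8_0)
```

The VRAM figures in the table run slightly above file size because the runtime also needs room for the KV cache and activation buffers.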

Compatible Hardware

Q4_K_M requires 228 GB VRAM

| Hardware | VRAM | Type | Fit | Est. Speed |
|---|---|---|---|---|
| Mac Studio M4 Ultra 512GB | 512 GB | Mac | Runs | ~4 tok/s |
| Mac Pro M2 Ultra 192GB | 192 GB | Mac | CPU Offload | ~1 tok/s |
| Mac Studio M4 Ultra 192GB | 192 GB | Mac | CPU Offload | ~1 tok/s |
104 hardware device(s) cannot run this model at Q4_K_M.
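The Fit column above reduces to a VRAM comparison against the quantization's requirement. A minimal sketch, using the 228 GB Q4_K_M figure from this page as the threshold:

```python
def fit_at_q4_k_m(device_vram_gb: int, required_gb: int = 228) -> str:
    # Enough VRAM: all expert weights stay resident ("Runs").
    # Otherwise layers spill to system RAM ("CPU Offload"), costing speed.
    return "Runs" if device_vram_gb >= required_gb else "CPU Offload"

print(fit_at_q4_k_m(512))  # Runs
print(fit_at_q4_k_m(192))  # CPU Offload
```

This is why the 512 GB Mac Studio runs at ~4 tok/s while the 192 GB machines, which must offload, drop to ~1 tok/s.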

Benchmark Scores

MMLU: 82.0