Llama 4 Maverick

by Meta · llama-4 family

400B parameters

text-generation · code-generation · reasoning · multilingual · vision · math · tool-use · creative-writing · summarization

Llama 4 Maverick is Meta's flagship mixture-of-experts model with 400B total parameters (17B active per token) across 128 experts. It features native multimodal support for text and image inputs, a 1M token context window, and strong performance across reasoning, coding, and multilingual tasks. Maverick delivers competitive results against top proprietary models while being open-weight. Its massive expert count enables broad knowledge coverage, though the full 400B parameter count means all expert weights must be loaded into memory for inference.
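The sparse-activation trade-off described above can be sketched numerically: per-token compute scales with the 17B active parameters, while memory scales with the full 400B, since every expert must stay resident.

```python
TOTAL_PARAMS = 400e9   # all 128 experts must be loaded for inference
ACTIVE_PARAMS = 17e9   # parameters actually routed per token

# Per-token FLOPs track the active count; VRAM tracks the total.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
print(f"active fraction per token: {active_fraction:.2%}")  # 4.25%
```

This is why Maverick is fast per token relative to a dense 400B model, yet still needs dense-model-class memory.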

Quick Start with Ollama

ollama run maverick-q4_K_M
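Once the model is pulled, it can also be called programmatically. A minimal sketch using only the standard library, assuming a local Ollama server on its default port (11434) and the `maverick-q4_K_M` tag from this page:

```python
import json
import urllib.request

# Default local Ollama endpoint (assumption: server is running on this machine).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(prompt: str, model: str = "maverick-q4_K_M") -> dict:
    # Non-streaming, one-shot generation request body for Ollama's REST API.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str) -> str:
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example usage (requires the running server and a pulled model):
#   print(generate("Explain mixture-of-experts in one sentence."))
```

Setting `"stream": False` returns the whole completion in a single JSON response instead of newline-delimited chunks.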
Resources: Ollama · Hugging Face · Official Page

Creator: Meta
Parameters: 400B
Architecture: mixture-of-experts
Context: 1024K tokens
Released: Apr 5, 2025
License: Llama 4 Community License
Ollama: llama4:maverick

Quantization Options

| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M (recommended) | 220 GB | 228 GB | — | maverick-q4_K_M |
| Q8_0 | 400 GB | 410 GB | — | maverick-q8_0 |
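The file sizes above follow from simple bits-per-weight arithmetic. A rough sketch, assuming ~4.4 bits/weight for Q4_K_M and 8 bits/weight for Q8_0 (approximate averages; actual GGUF sizes vary by tensor layout):

```python
def gguf_size_gb(total_params: float, bits_per_weight: float) -> float:
    # Every expert's weights are stored, so the full 400B count applies
    # even though only 17B parameters are active per token.
    return total_params * bits_per_weight / 8 / 1e9

print(round(gguf_size_gb(400e9, 4.4)))  # ~220 GB (Q4_K_M)
print(round(gguf_size_gb(400e9, 8.0)))  # 400 GB (Q8_0)
```

The VRAM figures in the table run slightly above file size because the runtime also needs room for the KV cache and activation buffers.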

Compatible Hardware

Q4_K_M requires 228 GB VRAM

| Hardware | VRAM | Type | Fit | Est. Speed |
|---|---|---|---|---|
| Mac Studio M4 Ultra 512GB | 512 GB | Mac | Runs | ~4 tok/s |
| Mac Pro M2 Ultra 192GB | 192 GB | Mac | CPU Offload | ~1 tok/s |
| Mac Studio M4 Ultra 192GB | 192 GB | Mac | CPU Offload | ~1 tok/s |
104 hardware device(s) cannot run this model at Q4_K_M.
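The Fit column above reduces to a VRAM comparison against the quantization's requirement. A minimal sketch, using the 228 GB Q4_K_M figure from this page as the threshold:

```python
def fit_at_q4_k_m(device_vram_gb: int, required_gb: int = 228) -> str:
    # Enough VRAM: all expert weights stay resident ("Runs").
    # Otherwise layers spill to system RAM ("CPU Offload"), costing speed.
    return "Runs" if device_vram_gb >= required_gb else "CPU Offload"

print(fit_at_q4_k_m(512))  # Runs
print(fit_at_q4_k_m(192))  # CPU Offload
```

This is why the 512 GB Mac Studio runs at ~4 tok/s while the 192 GB machines, which must offload, drop to ~1 tok/s.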

Benchmark Scores

MMLU: 82.0