Llama 4 Scout (109B/17B active)
by Meta · llama-4 family
109B parameters
text-generation code-generation reasoning multilingual vision math tool-use creative-writing summarization
Llama 4 Scout is Meta's mixture-of-experts model with 109B total parameters, of which only 17B are active per token, spread across 16 experts. It is natively multimodal (text and images) and supports a context window of up to 10M tokens. At Q4 it needs about 72 GB: too large for a single consumer GPU, but it fits on Macs with 96-128 GB of unified memory or on multi-GPU setups. Despite the large memory footprint, inference speed benefits from having only 17B active parameters per token. It is one of the most capable open-weight models from Meta.
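To make the gap between total and active parameters concrete, here is a minimal sketch of the arithmetic. The parameter counts come from the description above; the bits-per-weight figure and the memory bandwidth are hypothetical placeholders chosen only for illustration, and the bandwidth-bound formula is a common rule of thumb rather than a measured result.

```python
# Parameter counts taken from the model description.
TOTAL_PARAMS = 109e9   # all experts combined
ACTIVE_PARAMS = 17e9   # parameters used for any single token

# Assumptions for illustration only (not from the source).
ASSUMED_BITS_PER_WEIGHT = 4.5     # hypothetical 4-bit-class quantization
ASSUMED_BANDWIDTH_GB_S = 400.0    # hypothetical memory bandwidth


def weight_gb(params: float, bits_per_weight: float) -> float:
    """Storage for `params` weights in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def decode_ceiling_tok_s(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Rule-of-thumb ceiling: each token requires reading the weights once."""
    return bandwidth_gb_s / gb_read_per_token


active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
total_gb = weight_gb(TOTAL_PARAMS, ASSUMED_BITS_PER_WEIGHT)
active_gb = weight_gb(ACTIVE_PARAMS, ASSUMED_BITS_PER_WEIGHT)

print(f"active fraction: {active_fraction:.1%}")
print(f"weights held in memory: {total_gb:.1f} GB")
print(f"weights read per token: {active_gb:.1f} GB")
print(f"ceiling if all weights were read: "
      f"{decode_ceiling_tok_s(ASSUMED_BANDWIDTH_GB_S, total_gb):.1f} tok/s")
print(f"ceiling reading only active weights: "
      f"{decode_ceiling_tok_s(ASSUMED_BANDWIDTH_GB_S, active_gb):.1f} tok/s")
```

All 109B parameters still have to be resident in memory, which is why the footprint stays large; only the per-token read shrinks. Real throughput is usually well below either ceiling because of routing, attention, and runtime overhead.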
Quick Start with Ollama
ollama run scout-q4_K_M

| Spec | Value |
|---|---|
| Creator | Meta |
| Parameters | 109B |
| Architecture | mixture-of-experts |
| Context | 512K tokens |
| Released | Apr 5, 2025 |
| License | Llama 4 Community License |
| Ollama | llama4:scout |
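Besides the command line, a locally running Ollama server can be called over HTTP. The sketch below assumes Ollama's default local endpoint (`http://localhost:11434/api/generate`) and that the model has already been pulled under the `llama4:scout` tag listed above; the helper names `build_request` and `generate` are hypothetical and exist only for this example.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "llama4:scout"  # tag from the spec table


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_TAG, timeout: float = 600.0) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Summarize mixture-of-experts in one sentence.")` would return the model's reply as a string. The generous timeout is a guess to allow for slow first-token latency on offloaded configurations.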
Quantization Options
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q4_K_M (recommended) | 67 GB | 72 GB | ★★★★★ | scout-q4_K_M |
| Q8_0 | 117 GB | 125 GB | ★★★★★ | scout-q8_0 |
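The file sizes in this table can be turned into an effective bits-per-weight figure, and the gap between file size and required VRAM gives a rough idea of runtime overhead. The sketch below uses only the numbers from the table and the 109B parameter count; it assumes 1 GB means 1e9 bytes, which the page does not state.

```python
PARAMS = 109e9  # total parameters, from the spec table

# File size and VRAM requirement per format, copied from the table.
QUANTS = {
    "Q4_K_M": {"file_gb": 67, "vram_gb": 72},
    "Q8_0": {"file_gb": 117, "vram_gb": 125},
}


def bits_per_weight(file_gb: float, params: float = PARAMS) -> float:
    """Average bits stored per parameter, assuming 1 GB = 1e9 bytes."""
    return file_gb * 1e9 * 8 / params


def runtime_overhead_gb(file_gb: float, vram_gb: float) -> float:
    """Memory needed beyond the weights file (cache, buffers, and so on)."""
    return vram_gb - file_gb


for name, q in QUANTS.items():
    print(f"{name}: {bits_per_weight(q['file_gb']):.2f} bits/weight, "
          f"{runtime_overhead_gb(**q)} GB overhead")
```

Under that assumption, Q4_K_M works out to roughly 4.9 bits per weight and Q8_0 to roughly 8.6, so the averages sit somewhat above the nominal 4 and 8 bits.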
Compatible Hardware
for Q4_K_M (72 GB VRAM)
| Hardware | VRAM | Type | Fit | Est. Speed |
|---|---|---|---|---|
| Mac Studio M4 Ultra 512GB | 512 GB | mac | Runs | ~11 tok/s |
| Mac Pro M2 Ultra 192GB | 192 GB | mac | Runs | ~11 tok/s |
| Mac Studio M4 Ultra 192GB | 192 GB | mac | Runs | ~11 tok/s |
| Mac Studio M4 Max 128GB | 128 GB | mac | Runs | ~8 tok/s |
| MacBook Pro M4 Max 128GB | 128 GB | mac | Runs | ~8 tok/s |
| MacBook Pro M3 Max 96GB | 96 GB | mac | Runs | ~6 tok/s |
| Mac mini M4 Pro 64GB | 64 GB | mac | CPU Offload | ~4 tok/s |
| Mac Studio M4 Max 64GB | 64 GB | mac | CPU Offload | ~8 tok/s |
| MacBook Pro M4 Max 64GB | 64 GB | mac | CPU Offload | ~8 tok/s |
| Mac mini M4 Pro 48GB | 48 GB | mac | CPU Offload | ~4 tok/s |
| MacBook Pro M3 Max 48GB | 48 GB | mac | CPU Offload | ~6 tok/s |
| MacBook Pro M4 Max 48GB | 48 GB | mac | CPU Offload | ~8 tok/s |
| MacBook Pro M4 Pro 48GB | 48 GB | mac | CPU Offload | ~4 tok/s |
52 hardware devices cannot run this model configuration.
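The Fit column appears to follow a simple threshold on device memory against the 72 GB requirement. The sketch below reproduces that pattern; the page does not state the cutoff below which a device is excluded entirely, so `min_offload_gb` is a hypothetical parameter, set here to 48 GB only because that is the smallest CPU Offload entry listed.

```python
REQUIRED_GB = 72  # VRAM required for Q4_K_M, from the table heading


def classify_fit(memory_gb: float,
                 required_gb: float = REQUIRED_GB,
                 min_offload_gb: float = 48) -> str:
    """Classify a device by memory; min_offload_gb is an assumed cutoff."""
    if memory_gb >= required_gb:
        return "Runs"
    if memory_gb >= min_offload_gb:
        return "CPU Offload"
    return "Cannot run"


# Memory sizes that appear in the table, plus one smaller hypothetical device.
for gb in (512, 128, 96, 64, 48, 24):
    print(f"{gb} GB -> {classify_fit(gb)}")
```

The real site may weigh other factors, such as system RAM available for offload, so treat this as an approximation of the table rather than its actual rule.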
Benchmark Scores
| Benchmark | Score |
|---|---|
| MMLU | 80.0 |