Mixtral 8x22B

by Mistral AI · mistral family

141B parameters

text-generation · code-generation · reasoning · multilingual · math · tool-use

Mixtral 8x22B is Mistral AI's large-scale sparse mixture-of-experts model with 141 billion total parameters, organized as 8 experts of roughly 22 billion parameters each. For every token it routes computation through only 2 of the 8 experts (about 39B active parameters), delivering strong performance with far cheaper inference than a comparably sized dense model. The model supports a 64K context window, native function calling, and multilingual generation. It excels at code generation, mathematical reasoning, and tool use, making it well suited to complex agentic and retrieval-augmented workflows.
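The sparse routing described above can be illustrated with a minimal sketch (plain Python, toy "experts"): a gating network scores all 8 experts for each token, but only the top-2 are actually evaluated; the gate weights and experts here are hypothetical stand-ins.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def moe_forward(token, experts, gate_weights, k=2):
    """Score every expert with a linear gate, but run only the top-k.

    Mixtral-style routing: k=2 of 8 experts per token; the other 6 are
    skipped entirely, which is where the inference savings come from.
    """
    scores = [sum(w * x for w, x in zip(gw, token)) for gw in gate_weights]
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    probs = softmax([scores[i] for i in top])  # renormalize over chosen experts
    outs = {i: experts[i](token) for i in top}  # only k experts execute
    return [sum(p * outs[i][d] for p, i in zip(probs, top))
            for d in range(len(token))]

# Toy demo: 8 "experts" that just scale the input, and a hypothetical gate.
experts = [lambda t, f=i + 1: [f * x for x in t] for i in range(8)]
gate_weights = [[0.1 * i, 0.05 * i] for i in range(8)]
print(moe_forward([1.0, 2.0], experts, gate_weights))
```

In the real model each expert is a feed-forward block and the gate is learned; the sketch only shows the routing mechanics.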

Quick Start with Ollama

ollama run mixtral:8x22b-q4_K_M
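The model's native function calling can be exercised through Ollama's local chat API, which accepts OpenAI-style tool schemas. A minimal sketch that builds such a request (the `get_weather` tool and its schema are hypothetical examples; sending the request requires a running Ollama server):

```python
import json

def build_tool_call_request(prompt):
    """Build an Ollama /api/chat payload that offers the model one tool."""
    return {
        "model": "mixtral:8x22b",
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

payload = build_tool_call_request("What's the weather in Paris?")
print(json.dumps(payload, indent=2))
# Send with e.g.: requests.post("http://localhost:11434/api/chat", json=payload)
```

If the model decides to call the tool, the response message carries a `tool_calls` entry with the function name and arguments instead of plain text.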
Resources: Ollama · Hugging Face · Official Page
Creator: Mistral AI
Parameters: 141B
Architecture: mixture-of-experts
Context: 64K tokens
Released: Apr 17, 2024
License: Apache 2.0
Ollama: mixtral:8x22b

Quantization Options

| Format | File Size | VRAM Required | Ollama Tag |
|---|---|---|---|
| Q4_K_M (recommended) | 80 GB | 86 GB | 8x22b-q4_K_M |
| Q8_0 | 141 GB | 148 GB | 8x22b-q8_0 |
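The file sizes above follow from simple bits-per-weight arithmetic. A sketch, assuming ~4.5 effective bits per weight for Q4_K_M (an approximation here; Q4_K_M mixes 4- and 6-bit blocks) and a flat 8 bits for Q8_0:

```python
def quantized_size_gb(n_params, bits_per_weight):
    """Rough file size: parameters x effective bits per weight, in GB."""
    return n_params * bits_per_weight / 8 / 1e9

# 141B parameters at ~4.5 effective bits/weight lands near the listed 80 GB.
print(quantized_size_gb(141e9, 4.5))   # ~79 GB
# At a flat 8 bits/weight, Q8_0 matches the listed 141 GB.
print(quantized_size_gb(141e9, 8.0))   # 141 GB
```

The VRAM-required column is higher than the file size because the KV cache and activation buffers sit on top of the weights.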

Compatible Hardware

Q4_K_M requires 86 GB VRAM

| Hardware | VRAM | Type | Fit | Est. Speed |
|---|---|---|---|---|
| Mac Studio M4 Ultra 512GB | 512 GB | mac | Runs | ~10 tok/s |
| Mac Pro M2 Ultra 192GB | 192 GB | mac | Runs | ~9 tok/s |
| Mac Studio M4 Ultra 192GB | 192 GB | mac | Runs | ~10 tok/s |
| Mac Studio M4 Max 128GB | 128 GB | mac | Runs | ~6 tok/s |
| MacBook Pro M4 Max 128GB | 128 GB | mac | Runs | ~6 tok/s |
| MacBook Pro M5 Max 128GB | 128 GB | mac | Runs | ~6 tok/s |
| NVIDIA RTX PRO 6000 Blackwell | 96 GB | gpu | Runs (tight) | ~22 tok/s |
| MacBook Pro M3 Max 96GB | 96 GB | mac | Runs (tight) | ~5 tok/s |
| Mac mini M4 Pro 64GB | 64 GB | mac | CPU Offload | ~1 tok/s |
| Mac Studio M4 Max 64GB | 64 GB | mac | CPU Offload | ~2 tok/s |
| MacBook Pro M4 Max 64GB | 64 GB | mac | CPU Offload | ~2 tok/s |
| MacBook Pro M5 Max 64GB | 64 GB | mac | CPU Offload | ~2 tok/s |
95 hardware devices cannot run this model at Q4_K_M.

Benchmark Scores

MMLU: 77.8