Mixtral 8x7B

by Mistral AI · mistral family

47B parameters

text-generation code-generation reasoning multilingual math creative-writing summarization

Mixtral 8x7B is Mistral AI's sparse mixture-of-experts model. Each transformer layer contains eight expert feed-forward networks, and a routing mechanism activates two of them per token. Because the attention layers and embeddings are shared across experts, the model has about 47B total parameters rather than 8 × 7B = 56B, and only about 13B of them are used for any given token. Per-token compute is therefore close to that of a 13B dense model, although all 47B parameters must still fit in memory. The model delivers performance competitive with much larger dense models while keeping inference faster. It excels at reasoning, multilingual tasks, and code generation, and is well-suited to users who want high-quality output on a single high-memory GPU or Apple Silicon machine.
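The gap between the total and per-token parameter counts can be checked with a short calculation. The sketch below is illustrative only: the layer sizes are taken from the published Mixtral 8x7B configuration, not from this page, and the helper name `count_params` is hypothetical.

```python
# Hyperparameters from the published Mixtral 8x7B configuration
# (assumption: they are not listed on this page).
HIDDEN = 4096     # model (embedding) dimension
FFN = 14336       # hidden dimension of each expert feed-forward block
LAYERS = 32
HEADS = 32
KV_HEADS = 8      # grouped-query attention uses fewer key/value heads
HEAD_DIM = 128
VOCAB = 32000
EXPERTS = 8
TOP_K = 2         # experts activated per token


def count_params(experts_counted: int) -> int:
    """Count weights when `experts_counted` experts per layer are included."""
    # Attention: Q and O projections are full size, K and V are smaller.
    attn = 2 * HIDDEN * (HEADS * HEAD_DIM) + 2 * HIDDEN * (KV_HEADS * HEAD_DIM)
    # Each expert is a gated feed-forward block with three weight matrices.
    expert = 3 * HIDDEN * FFN
    # Router: one linear layer that scores every expert.
    router = HIDDEN * EXPERTS
    # Two normalization weight vectors per layer.
    norms = 2 * HIDDEN
    per_layer = attn + experts_counted * expert + router + norms
    # Input embeddings, output head, and final norm sit outside the layers.
    outside = 2 * VOCAB * HIDDEN + HIDDEN
    return LAYERS * per_layer + outside


total = count_params(EXPERTS)   # every expert must be stored
active = count_params(TOP_K)    # only two experts run per token
print(f"total:  {total / 1e9:.1f}B")   # total:  46.7B
print(f"active: {active / 1e9:.1f}B")  # active: 12.9B
```

Almost all of the weights sit in the expert blocks, which is why counting two experts per layer instead of eight brings the figure down from roughly 47B to roughly 13B.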

Quick Start with Ollama

ollama run mixtral:8x7b-instruct-v0.1-q4_K_M
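Once the model has been pulled, it can also be queried programmatically through Ollama's local HTTP API. The following is a minimal sketch, assuming Ollama is serving on its default address (`http://localhost:11434`) and that the model is available as `mixtral` with the tag listed on this page. The helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumptions: default local Ollama address, and a model already pulled
# under the Ollama name "mixtral" with the tag shown on this page.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "mixtral:8x7b-instruct-v0.1-q4_K_M"


def build_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama HTTP API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `print(generate("Summarize mixture-of-experts routing in two sentences."))` would print the model's reply. Setting `"stream": False` returns the whole completion as one JSON object rather than a sequence of partial chunks.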
Resources: Ollama · Hugging Face · Official Page · Research Paper
Creator: Mistral AI
Parameters: 47B
Architecture: transformer-decoder
Context: 32K tokens
Released: Dec 11, 2023
License: Apache 2.0
Ollama: mixtral

Quantization Options

| Format | File Size | VRAM Required | Ollama Tag |
|---|---|---|---|
| Q4_K_M (recommended) | 22.6 GB | 29.7 GB | 8x7b-instruct-v0.1-q4_K_M |
| Q5_K_M | 26.3 GB | 34.4 GB | 8x7b-instruct-v0.1-q5_K_M |
| Q8_0 | 44.1 GB | 49 GB | 8x7b-instruct-v0.1-q8_0 |
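Choosing a quantization comes down to finding the largest listed format whose VRAM requirement fits the available memory. The sketch below copies the three rows above. The helper name `pick_quant` is hypothetical, and it assumes that a larger format means higher output quality, which the page does not state explicitly.

```python
from typing import Optional

# (format, file size in GB, VRAM required in GB), copied from the table
# above and ordered from smallest to largest.
QUANTS = [
    ("Q4_K_M", 22.6, 29.7),
    ("Q5_K_M", 26.3, 34.4),
    ("Q8_0", 44.1, 49.0),
]


def pick_quant(vram_gb: float) -> Optional[str]:
    """Return the largest listed format that fits in `vram_gb`, else None."""
    fitting = [name for name, _size, need in QUANTS if need <= vram_gb]
    return fitting[-1] if fitting else None


for vram in (24, 32, 48, 96):
    print(f"{vram} GB -> {pick_quant(vram)}")
# 24 GB -> None
# 32 GB -> Q4_K_M
# 48 GB -> Q5_K_M
# 96 GB -> Q8_0
```

A result of `None` means no listed format fits entirely in that much memory, so some layers would have to be offloaded to the CPU.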

Compatible Hardware

Q4_K_M requires 29.7 GB VRAM

| Hardware | VRAM | Type | Fit | Est. Speed |
|---|---|---|---|---|
| Mac Studio M4 Ultra 512GB | 512 GB | mac | Runs | ~28 tok/s |
| Mac Pro M2 Ultra 192GB | 192 GB | mac | Runs | ~27 tok/s |
| Mac Studio M4 Ultra 192GB | 192 GB | mac | Runs | ~28 tok/s |
| Mac Studio M4 Max 128GB | 128 GB | mac | Runs | ~18 tok/s |
| MacBook Pro M4 Max 128GB | 128 GB | mac | Runs | ~18 tok/s |
| MacBook Pro M5 Max 128GB | 128 GB | mac | Runs | ~18 tok/s |
| NVIDIA RTX PRO 6000 Blackwell | 96 GB | gpu | Runs | ~65 tok/s |
| MacBook Pro M3 Max 96GB | 96 GB | mac | Runs | ~13 tok/s |
| Mac mini M4 Pro 64GB | 64 GB | mac | Runs | ~9 tok/s |
| Mac Studio M4 Max 64GB | 64 GB | mac | Runs | ~18 tok/s |
| MacBook Pro M4 Max 64GB | 64 GB | mac | Runs | ~18 tok/s |
| MacBook Pro M5 Max 64GB | 64 GB | mac | Runs | ~18 tok/s |
| NVIDIA RTX 6000 Ada Generation | 48 GB | gpu | Runs | ~32 tok/s |
| NVIDIA RTX A6000 | 48 GB | gpu | Runs | ~26 tok/s |
| NVIDIA RTX PRO 5000 Blackwell | 48 GB | gpu | Runs | ~32 tok/s |
| Mac mini M4 Pro 48GB | 48 GB | mac | Runs | ~9 tok/s |
| MacBook Pro M3 Max 48GB | 48 GB | mac | Runs | ~13 tok/s |
| MacBook Pro M4 Pro 48GB | 48 GB | mac | Runs | ~9 tok/s |
| MacBook Pro M4 Max 48GB | 48 GB | mac | Runs | ~18 tok/s |
| MacBook Pro M5 Max 48GB | 48 GB | mac | Runs | ~14 tok/s |
| MacBook Pro M5 Pro 48GB | 48 GB | mac | Runs | ~9 tok/s |
| Mac Studio M4 Max 36GB | 36 GB | mac | Runs | ~18 tok/s |
| MacBook Pro M3 Pro 36GB | 36 GB | mac | Runs | ~5 tok/s |
| MacBook Pro M5 Max 36GB | 36 GB | mac | Runs | ~14 tok/s |
| NVIDIA RTX 5000 Ada Generation | 32 GB | gpu | Runs (tight) | ~24 tok/s |
| NVIDIA GeForce RTX 5090 | 32 GB | gpu | Runs (tight) | ~60 tok/s |
| iMac M4 32GB | 32 GB | mac | Runs (tight) | ~4 tok/s |
| Mac mini M4 32GB | 32 GB | mac | Runs (tight) | ~4 tok/s |
| MacBook Air M4 32GB | 32 GB | mac | Runs (tight) | ~4 tok/s |
| MacBook Air M5 32GB | 32 GB | mac | Runs (tight) | ~4 tok/s |
| MacBook Pro M5 32GB | 32 GB | mac | Runs (tight) | ~4 tok/s |
| AMD Radeon RX 7900 XTX | 24 GB | gpu | CPU Offload | ~10 tok/s |
| NVIDIA GeForce RTX 3090 Ti | 24 GB | gpu | CPU Offload | ~10 tok/s |
| NVIDIA GeForce RTX 3090 | 24 GB | gpu | CPU Offload | ~10 tok/s |
| NVIDIA GeForce RTX 4090 | 24 GB | gpu | CPU Offload | ~10 tok/s |
| NVIDIA RTX A5000 | 24 GB | gpu | CPU Offload | ~8 tok/s |
| iMac M3 24GB | 24 GB | mac | CPU Offload | ~1 tok/s |
| Mac mini M2 24GB | 24 GB | mac | CPU Offload | ~1 tok/s |
| Mac mini M4 Pro 24GB | 24 GB | mac | CPU Offload | ~3 tok/s |
| MacBook Air M2 24GB | 24 GB | mac | CPU Offload | ~1 tok/s |
| MacBook Air M4 24GB | 24 GB | mac | CPU Offload | ~1 tok/s |
| MacBook Air M5 24GB | 24 GB | mac | CPU Offload | ~1 tok/s |
| MacBook Pro M4 Pro 24GB | 24 GB | mac | CPU Offload | ~3 tok/s |
| MacBook Pro M5 24GB | 24 GB | mac | CPU Offload | ~1 tok/s |
| MacBook Pro M5 Pro 24GB | 24 GB | mac | CPU Offload | ~3 tok/s |
| AMD Radeon RX 7900 XT | 20 GB | gpu | CPU Offload | ~8 tok/s |
| NVIDIA RTX 4000 Ada Generation | 20 GB | gpu | CPU Offload | ~4 tok/s |
60 hardware devices cannot run this model at Q4_K_M.
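The Fit column appears to follow from comparing each device's memory with the 29.7 GB requirement. The sketch below reproduces the labels for the rows listed here, but the page does not publish its rule. Both thresholds are hypothetical values chosen only to be consistent with those rows, and the cutoff below which a device cannot run the model at all is a guess.

```python
REQUIRED_GB = 29.7     # Q4_K_M requirement stated on this page

# Hypothetical thresholds (assumptions, not taken from the page).
TIGHT_HEADROOM = 1.15  # less than 15% spare memory is labeled "tight"
OFFLOAD_FLOOR = 0.60   # guessed lower bound for partial CPU offload


def classify_fit(vram_gb: float, required_gb: float = REQUIRED_GB) -> str:
    """Map available memory to one of the Fit labels used in the table."""
    ratio = vram_gb / required_gb
    if ratio >= TIGHT_HEADROOM:
        return "Runs"
    if ratio >= 1.0:
        return "Runs (tight)"
    if ratio >= OFFLOAD_FLOOR:
        return "CPU Offload"
    return "Cannot run"


for vram in (48, 36, 32, 24, 20):
    print(f"{vram} GB -> {classify_fit(vram)}")
# 48 GB -> Runs
# 36 GB -> Runs
# 32 GB -> Runs (tight)
# 24 GB -> CPU Offload
# 20 GB -> CPU Offload
```

On devices labeled CPU Offload, part of the model runs from system memory, which is consistent with the much lower estimated speeds in those rows.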

Benchmark Scores

MMLU: 70.6