Mixtral 8x7B

Name: Mixtral 8x7B
Author: Mistral AI

Apache 2.0

Mistral AI · 47B · transformer-decoder

2023-12-11 · 32K context · 47B params

Use Cases

chat, code, reasoning, multilingual, math, writing, summary

Quantization Options

Quant	Bits	VRAM	Quality
Q4_K_M (recommended)	4	29.7 GB	Good
Q5_K_M	5	34.4 GB	Good
Q8_0	8	49.0 GB	Excellent
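
As a rough illustration of how the table can be used, the sketch below picks the highest-bit quantization whose listed VRAM figure fits a given budget. The rows are copied from the table; the function name `pick_quant` is hypothetical, and the sketch assumes the listed VRAM is a hard threshold, ignoring CPU offloading and any extra memory a long context may need.

```python
from typing import Optional

# (name, bits, vram_gb, quality) rows copied from the quantization table.
QUANTS = [
    ("Q4_K_M", 4, 29.7, "Good"),
    ("Q5_K_M", 5, 34.4, "Good"),
    ("Q8_0", 8, 49.0, "Excellent"),
]


def pick_quant(available_vram_gb: float) -> Optional[str]:
    """Return the highest-bit quant whose listed VRAM fits, or None if none fit."""
    fitting = [q for q in QUANTS if q[2] <= available_vram_gb]
    if not fitting:
        return None
    # More bits means less quantization loss, so prefer the largest that fits.
    return max(fitting, key=lambda q: q[1])[0]


for budget in (24.0, 32.0, 48.0, 80.0):
    print(budget, pick_quant(budget))
```

Under these assumptions a 24 GB card fits none of the listed options without offloading, while a 48 GB card can run Q5_K_M but falls just short of Q8_0.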

About this model

Mixtral 8x7B is Mistral AI's sparse mixture-of-experts model. Each transformer layer replaces the single feed-forward block with eight expert feed-forward networks, and a router activates two of them for every token. Because the attention layers and embeddings are shared across experts, the model has about 47B total parameters rather than 8 × 7B = 56B, and only about 13B of them are used for any given token. Per-token compute is therefore close to that of a 13B dense model, although all 47B parameters must still be held in memory. The model delivers performance competitive with much larger dense models while maintaining faster inference speeds. It is strongest at reasoning, multilingual tasks, and code generation, and suits users who want high-quality output on hardware with roughly 30 GB of VRAM or more.
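
The routing step described above can be sketched in a few lines of plain Python. This is a toy illustration, not Mistral AI's implementation: the experts here are stand-in functions that just scale their input, the router logits are supplied by hand, and the names `top2_route` and `moe_layer` are hypothetical. It assumes the gate weights are a softmax taken over the two selected logits only.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def top2_route(logits: Sequence[float]) -> List[Tuple[int, float]]:
    """Return (expert_index, gate_weight) for the two highest-scoring experts."""
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:2]
    peak = max(logits[i] for i in ranked)
    # Softmax over the two selected logits only, so the weights sum to 1.
    exps = [math.exp(logits[i] - peak) for i in ranked]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(ranked, exps)]


def moe_layer(
    x: Vector,
    router_logits: Sequence[float],
    experts: Sequence[Callable[[Vector], Vector]],
) -> Vector:
    """Weighted sum of the outputs of the two selected experts."""
    out = [0.0] * len(x)
    for idx, weight in top2_route(router_logits):
        y = experts[idx](x)  # the other six experts are never evaluated
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out


# Toy experts: expert i multiplies its input by (i + 1).
experts = [lambda v, s=i + 1: [s * a for a in v] for i in range(8)]
logits = [0.1, 2.0, -1.0, 2.0, 0.0, 0.3, -0.5, 1.0]

print(top2_route(logits))
print(moe_layer([1.0, 2.0], logits, experts))
```

In this example experts 1 and 3 tie for the highest score and each receive half the weight, so the output is the average of their two results. In the real model the logits come from a learned linear router and each expert is a full feed-forward network.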

Benchmarks

MMLU: 70.6

Install

Ollama

ollama run mixtral:8x7b-instruct-v0.1-q4_K_M
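
Once the model has been pulled, it can also be called from a script. The sketch below assumes a local Ollama server on its default port (11434) and uses its `/api/generate` endpoint with streaming turned off; the helper names `build_request` and `generate` are hypothetical, and the URL should be adjusted if your server is configured differently.

```python
import json
import urllib.request

# Assumption: Ollama's default local endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_TAG = "mixtral:8x7b-instruct-v0.1-q4_K_M"  # tag from the command above


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout_s: float = 300.0) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout_s) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

With the server running, `print(generate("Summarize mixture-of-experts in one sentence."))` would print the model's reply. The generous timeout is there because the first call may need to load the weights into memory.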

llama.cpp / GGUF

Download GGUF from HuggingFace

Specs

Parameters: 47B
Architecture: transformer-decoder
Context: 32K tokens (32,768)
Min VRAM: 29.7 GB
Recommended: 29.7 GB
Family: Mistral
Released: 2023-12-11
License: Apache 2.0
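
The 47B parameter count in the specs, and the roughly 13B used per token, can be reproduced with some arithmetic. The sketch below assumes the publicly documented Mixtral 8x7B hyperparameters, which are not listed on this page: model dimension 4096, feed-forward hidden size 14336, 32 layers, 32 attention heads with 8 key/value heads of size 128, a 32,000-token vocabulary, and untied input and output embeddings. The function name `count_params` is hypothetical.

```python
# Assumed hyperparameters (published Mixtral 8x7B configuration, not from this page).
DIM = 4096
HIDDEN = 14336
N_LAYERS = 32
N_HEADS = 32
N_KV_HEADS = 8
HEAD_DIM = 128
VOCAB = 32000
N_EXPERTS = 8
TOP_K = 2  # experts activated per token


def count_params(experts_used: int) -> int:
    """Parameter count when `experts_used` experts per layer are included."""
    # Query and output projections are full size; key and value use fewer heads.
    attention = 2 * DIM * N_HEADS * HEAD_DIM + 2 * DIM * N_KV_HEADS * HEAD_DIM
    expert_ffn = 3 * DIM * HIDDEN   # gate, up, and down projections
    router = DIM * N_EXPERTS        # one logit per expert
    norms = 2 * DIM                 # two normalization layers per block
    per_layer = attention + router + norms + experts_used * expert_ffn
    embeddings = 2 * VOCAB * DIM    # input embedding plus output head
    return N_LAYERS * per_layer + embeddings + DIM  # plus the final norm


total = count_params(N_EXPERTS)
active = count_params(TOP_K)
print(f"total:  {total / 1e9:.1f}B")   # total:  46.7B
print(f"active: {active / 1e9:.1f}B")  # active: 12.9B
```

Under these assumptions each expert contributes about 5.6B parameters across all layers, which suggests the "7B" in the name reflects the shape of the underlying Mistral 7B model rather than the size of each expert.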