Mixtral 8x7B

Apache 2.0

Mistral AI · 47B · transformer-decoder

2023-12-11 · 33K context · 47B params

Use Cases

chat · code · reasoning · multilingual · math · writing · summary

Quantization Options

Quant                  Bits  VRAM     Quality
Q4_K_M (recommended)   4     29.7 GB  Good
Q5_K_M                 5     34.4 GB  Good
Q8_0                   8     49.0 GB  Excellent
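
As a rough cross-check on the VRAM figures above, the sketch below computes a weights-only lower bound from the parameter count and the nominal bit width. The function name `weight_footprint_gb` is hypothetical, and the formula is a simplification: it assumes every weight is stored at exactly the nominal bit width, which the listed figures exceed. The gap is presumably due to quantization metadata and runtime buffers, but that explanation is an assumption, not something this page states.

```python
def weight_footprint_gb(total_params: float, bits_per_weight: float) -> float:
    """Weights-only size in GB (1 GB = 1e9 bytes), ignoring all overhead."""
    return total_params * bits_per_weight / 8 / 1e9


# 47B total parameters, as listed on this page.
TOTAL_PARAMS = 47e9

# (quant name, nominal bits, VRAM figure listed on this page)
LISTED = [("Q4_K_M", 4, 29.7), ("Q5_K_M", 5, 34.4), ("Q8_0", 8, 49.0)]

for name, bits, listed_gb in LISTED:
    floor_gb = weight_footprint_gb(TOTAL_PARAMS, bits)
    print(f"{name}: at least {floor_gb:.1f} GB for weights, listed {listed_gb} GB")
```

The lower bound uses all 47B parameters, not the roughly 13B active per token, because every expert must be resident in memory even though only two run for a given token.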

About this model

Mixtral 8x7B is Mistral AI's sparse mixture-of-experts model. Each transformer layer contains eight feed-forward expert networks, and a router selects two of them for every token. Because the attention layers and embeddings are shared across experts, the model has about 47B total parameters rather than 8 × 7B = 56B, and only about 13B of them are active for any given token. Per-token compute is therefore close to that of a 13B dense model, although all 47B parameters must still be held in memory. The model is competitive with much larger dense models while generating tokens faster than they do. It is strongest at reasoning, multilingual tasks, and code generation, and suits users who want high-quality output on hardware that can hold the quantized weights listed under Quantization Options.
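
The two-experts-per-token routing described above can be illustrated with a toy sketch in plain Python. It is not Mixtral's implementation: the helper names `top2_route` and `moe_layer`, the constant-scaling "experts", and the example logits are all hypothetical, and the gate weights are assumed to be a softmax over the two selected router scores.

```python
import math


def top2_route(router_logits):
    """Return [(expert_index, gate_weight), ...] for the two highest-scoring experts."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:2]
    # Softmax over the two selected logits only (assumed gating rule).
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_layer(x, router_logits, experts):
    """Run only the two selected experts and sum their gate-weighted outputs."""
    out = [0.0] * len(x)
    for idx, weight in top2_route(router_logits):
        y = experts[idx](x)
        out = [o + weight * v for o, v in zip(out, y)]
    return out


# Toy stand-ins for the eight experts: expert k multiplies its input by k.
experts = [lambda v, k=k: [k * t for t in v] for k in range(8)]

# Hypothetical router scores for one token, one score per expert.
logits = [0.1, 2.0, -1.0, 0.5, 1.0, 0.0, -0.3, 0.2]

print(top2_route(logits))          # experts 1 and 4 are selected
print(moe_layer([1.0, 2.0], logits, experts))
```

The six unselected experts are never called for this token, which is why compute scales with the active parameters rather than the total.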

Benchmarks

MMLU: 70.6