Mistral 7B

by Mistral AI · mistral family

7B parameters

text-generation code-generation multilingual summarization

Mistral 7B was a groundbreaking release from Mistral AI that demonstrated 7B parameter models could punch well above their weight class. Using grouped-query attention and sliding window attention, it outperformed larger models like Llama 2 13B on most benchmarks. This model remains a popular choice for local inference due to its excellent performance-to-size ratio. It handles general text generation, coding assistance, and multilingual tasks effectively while running on modest consumer hardware.
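Sliding window attention restricts each token to attending over a fixed-size window of recent positions instead of the full sequence, which bounds the per-token attention cost. A minimal sketch of the resulting attention mask in NumPy (the window size here is illustrative and much smaller than the 4096-token window described in the Mistral 7B paper):

```python
import numpy as np

def sliding_window_causal_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask: position i may attend to j iff i - window < j <= i."""
    i = np.arange(seq_len)[:, None]  # query positions (rows)
    j = np.arange(seq_len)[None, :]  # key positions (columns)
    # Causal (no future tokens) AND within the sliding window.
    return (j <= i) & (j > i - window)

mask = sliding_window_causal_mask(seq_len=8, window=3)
# Row 5 attends only to positions 3, 4, 5 — not to earlier or future tokens.
```

Information from outside the window still propagates across layers, since each layer extends the effective receptive field by another window's width.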

Quick Start with Ollama

ollama run mistral:7b-instruct-q8_0
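Once the model is pulled, Ollama also serves a local HTTP API (on port 11434 by default). A minimal sketch of calling its `/api/generate` endpoint from Python, assuming an Ollama server is running locally with this model available:

```python
import json
import urllib.request

def build_generate_request(model: str, prompt: str) -> dict:
    # /api/generate expects at least "model" and "prompt";
    # "stream": False requests a single JSON response instead of a stream.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, host: str = "http://localhost:11434") -> str:
    payload = json.dumps(
        build_generate_request("mistral:7b-instruct-q8_0", prompt)
    ).encode()
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The tag in the request body should match whichever quantization you pulled (e.g. `mistral:7b-instruct-q4_K_M`).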
Creator: Mistral AI
Parameters: 7B
Architecture: transformer-decoder
Context Length: 32K tokens
License: Apache 2.0
Released: Sep 27, 2023
Ollama: mistral

Quantization Options

Format | File Size | VRAM Required | Ollama Tag
Q4_K_M | 3.5 GB | 5.7 GB | 7b-instruct-q4_K_M
Q8_0 (recommended) | 6.3 GB | 9 GB | 7b-instruct-q8_0
F16 | 13.3 GB | 16 GB | 7b-instruct-fp16
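As a rough rule of thumb, a quantized file's size is the parameter count times the bits stored per weight. Actual GGUF files deviate somewhat, since some tensors are kept at higher precision and quantization blocks carry scale metadata. A small sketch of the estimate:

```python
def approx_size_gb(n_params: float, bits_per_weight: float) -> float:
    # bytes = params * bits / 8; reported in decimal gigabytes
    return n_params * bits_per_weight / 8 / 1e9

# ~7.2B parameters at 16 bits per weight lands near the F16 row above.
print(round(approx_size_gb(7.2e9, 16), 1))
```

Note that Q8_0 effectively stores slightly more than 8 bits per weight (each block of 8-bit values carries a small scale factor), and VRAM needs exceed file size because the KV cache and activations also occupy memory.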


Benchmark Scores

MMLU: 62.5