Devstral 24B

by Mistral AI · mistral family

Tags: text-generation, code-generation, reasoning

Devstral 24B is Mistral's dedicated coding agent model, fine-tuned from Mistral Small 3.1 for software engineering tasks. It excels at code generation, repository-scale understanding, debugging, and agentic coding workflows. At launch it ranked #1 among open-source coding agent models. At Q4 quantization it needs about 17 GB of VRAM and so fits on 24 GB GPUs, which makes it a good choice for developers who want a local alternative to cloud-based coding assistants like GitHub Copilot.

Quick Start with Ollama

ollama run devstral:24b-q4_K_M
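
Once the model is running locally, it can also be queried from code. The following is a minimal sketch, assuming an Ollama server on its default local address (port 11434) and the `devstral` model name listed on this page. The helper names `build_request` and `generate` are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="devstral"):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="devstral"):
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Write a function that reverses a linked list.")` should return the model's completion as a string, provided the server is up and the model has been pulled.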
Creator: Mistral AI
Parameters: 24B
Architecture: transformer-decoder
Context: 128K tokens
Released: May 21, 2025
License: Apache 2.0
Ollama: devstral

Quantization Options

| Format | File Size | VRAM Required | Ollama Tag |
|---|---|---|---|
| Q4_K_M (recommended) | 14 GB | 17 GB | 24b-q4_K_M |
| Q8_0 | 25 GB | 29 GB | 24b-q8_0 |
| F16 | 48 GB | 53 GB | 24b-fp16 |
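
To make the table easier to apply, here is a small sketch that picks the highest-precision format whose stated VRAM requirement fits a given budget. The VRAM figures are copied from the quantization options above. The function name `best_quant` is hypothetical, and the sketch assumes that a larger VRAM requirement implies higher precision, which holds for these three formats.

```python
# VRAM requirements in GB, taken from the quantization options on this page.
VRAM_REQUIRED_GB = {"Q4_K_M": 17, "Q8_0": 29, "F16": 53}


def best_quant(vram_gb):
    """Return the highest-precision format that fits in vram_gb, or None."""
    fitting = [(need, name) for name, need in VRAM_REQUIRED_GB.items()
               if need <= vram_gb]
    # Assumption: the format needing the most VRAM is the most precise one.
    return max(fitting)[1] if fitting else None
```

For example, a 24 GB card selects Q4_K_M, while a 16 GB card gets `None`, meaning no format fits entirely in VRAM.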

Compatible Hardware

for Q4_K_M (17 GB VRAM)

| Hardware | VRAM | Type | Fit | Est. Speed |
|---|---|---|---|---|
| Mac Studio M4 Ultra 512GB | 512 GB | mac | Runs | ~48 tok/s |
| Mac Pro M2 Ultra 192GB | 192 GB | mac | Runs | ~47 tok/s |
| Mac Studio M4 Ultra 192GB | 192 GB | mac | Runs | ~48 tok/s |
| Mac Studio M4 Max 128GB | 128 GB | mac | Runs | ~32 tok/s |
| MacBook Pro M4 Max 128GB | 128 GB | mac | Runs | ~32 tok/s |
| MacBook Pro M3 Max 96GB | 96 GB | mac | Runs | ~24 tok/s |
| Mac mini M4 Pro 64GB | 64 GB | mac | Runs | ~16 tok/s |
| Mac Studio M4 Max 64GB | 64 GB | mac | Runs | ~32 tok/s |
| MacBook Pro M4 Max 64GB | 64 GB | mac | Runs | ~32 tok/s |
| Mac mini M4 Pro 48GB | 48 GB | mac | Runs | ~16 tok/s |
| MacBook Pro M3 Max 48GB | 48 GB | mac | Runs | ~24 tok/s |
| MacBook Pro M4 Max 48GB | 48 GB | mac | Runs | ~32 tok/s |
| MacBook Pro M4 Pro 48GB | 48 GB | mac | Runs | ~16 tok/s |
| Mac Studio M4 Max 36GB | 36 GB | mac | Runs | ~32 tok/s |
| MacBook Pro M3 Pro 36GB | 36 GB | mac | Runs | ~9 tok/s |
| NVIDIA GeForce RTX 5090 | 32 GB | gpu | Runs | ~105 tok/s |
| iMac M4 32GB | 32 GB | mac | Runs | ~7 tok/s |
| Mac mini M4 32GB | 32 GB | mac | Runs | ~7 tok/s |
| MacBook Air M4 32GB | 32 GB | mac | Runs | ~7 tok/s |
| AMD Radeon RX 7900 XTX | 24 GB | gpu | Runs | ~56 tok/s |
| NVIDIA GeForce RTX 3090 | 24 GB | gpu | Runs | ~55 tok/s |
| NVIDIA GeForce RTX 4090 | 24 GB | gpu | Runs | ~59 tok/s |
| iMac M3 24GB | 24 GB | mac | Runs | ~6 tok/s |
| Mac mini M2 24GB | 24 GB | mac | Runs | ~6 tok/s |
| Mac mini M4 Pro 24GB | 24 GB | mac | Runs | ~16 tok/s |
| MacBook Air M2 24GB | 24 GB | mac | Runs | ~6 tok/s |
| MacBook Air M4 24GB | 24 GB | mac | Runs | ~7 tok/s |
| MacBook Pro M4 Pro 24GB | 24 GB | mac | Runs | ~16 tok/s |
| AMD Radeon RX 7900 XT | 20 GB | gpu | Runs | ~47 tok/s |
| MacBook Pro M3 Pro 18GB | 18 GB | mac | Runs (tight) | ~9 tok/s |
| AMD Radeon RX 6800 XT | 16 GB | gpu | CPU Offload | ~30 tok/s |
| AMD Radeon RX 7800 XT | 16 GB | gpu | CPU Offload | ~37 tok/s |
| Intel Arc A770 | 16 GB | gpu | CPU Offload | ~33 tok/s |
| NVIDIA GeForce RTX 4060 Ti 16GB | 16 GB | gpu | CPU Offload | ~17 tok/s |
| NVIDIA GeForce RTX 4070 Ti Super | 16 GB | gpu | CPU Offload | ~40 tok/s |
| NVIDIA GeForce RTX 4080 Super | 16 GB | gpu | CPU Offload | ~43 tok/s |
| NVIDIA GeForce RTX 4080 | 16 GB | gpu | CPU Offload | ~42 tok/s |
| NVIDIA GeForce RTX 5070 Ti | 16 GB | gpu | CPU Offload | ~53 tok/s |
| NVIDIA GeForce RTX 5080 | 16 GB | gpu | CPU Offload | ~56 tok/s |
| iMac M1 16GB | 16 GB | mac | CPU Offload | ~4 tok/s |
| iMac M4 16GB | 16 GB | mac | CPU Offload | ~7 tok/s |
| Mac mini M1 16GB | 16 GB | mac | CPU Offload | ~4 tok/s |
| Mac mini M4 16GB | 16 GB | mac | CPU Offload | ~7 tok/s |
| MacBook Air M2 16GB | 16 GB | mac | CPU Offload | ~6 tok/s |
| MacBook Air M3 16GB | 16 GB | mac | CPU Offload | ~6 tok/s |
| MacBook Air M4 16GB | 16 GB | mac | CPU Offload | ~7 tok/s |
| MacBook Pro M1 16GB | 16 GB | mac | CPU Offload | ~4 tok/s |
| MacBook Pro M2 Pro 16GB | 16 GB | mac | CPU Offload | ~12 tok/s |
| AMD Radeon RX 7700 XT | 12 GB | gpu | CPU Offload | ~25 tok/s |
| NVIDIA GeForce RTX 3060 12GB | 12 GB | gpu | CPU Offload | ~21 tok/s |
| NVIDIA GeForce RTX 3080 12GB | 12 GB | gpu | CPU Offload | ~54 tok/s |
| NVIDIA GeForce RTX 4070 Super | 12 GB | gpu | CPU Offload | ~30 tok/s |
| NVIDIA GeForce RTX 4070 Ti | 12 GB | gpu | CPU Offload | ~30 tok/s |
| NVIDIA GeForce RTX 4070 | 12 GB | gpu | CPU Offload | ~30 tok/s |
| NVIDIA GeForce RTX 5070 | 12 GB | gpu | CPU Offload | ~40 tok/s |
10 hardware devices cannot run this model configuration.
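
The Fit labels in the hardware list appear to follow from comparing device memory with the 17 GB Q4_K_M requirement. The sketch below reproduces that pattern under stated assumptions: the 2 GB "tight" margin is a guess chosen to match the 18 GB and 20 GB rows, and `classify_fit` is a hypothetical helper, not the page's actual method. The cutoff below which a device cannot run the model at all is not stated here, so the sketch does not model it.

```python
REQUIRED_GB = 17      # Q4_K_M VRAM requirement stated on this page
TIGHT_MARGIN_GB = 2   # assumption: less headroom than this counts as "tight"


def classify_fit(vram_gb, required_gb=REQUIRED_GB):
    """Label a device by how its memory compares to the model's requirement."""
    headroom = vram_gb - required_gb
    if headroom < 0:
        # Model does not fit; part of it must be offloaded to the CPU.
        return "CPU Offload"
    if headroom < TIGHT_MARGIN_GB:
        return "Runs (tight)"
    return "Runs"
```

With these assumptions, 24 GB and 20 GB devices are labeled "Runs", an 18 GB device "Runs (tight)", and 16 GB or 12 GB devices "CPU Offload", matching the rows listed above.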

Benchmark Scores

MMLU: 72.0