Qwen 2.5 Coder 32B

by Alibaba · qwen-2.5 family

32B parameters · text-generation · code-generation · reasoning

Qwen 2.5 Coder 32B is the flagship of the Qwen 2.5 Coder series and one of the strongest open-source coding models available. It rivals GPT-4o on code-generation benchmarks and supports a 128K-token context for working with large codebases. At Q4 it needs about 23 GB of VRAM, which fits on an RTX 3090 or 5090, or a Mac with 24 GB+ of unified memory. It is the go-to choice for developers who want the best possible local coding assistant and have the hardware to support it.

Quick Start with Ollama

ollama run qwen2.5-coder:32b-instruct-q4_K_M
Resources: Ollama · Hugging Face · Official Page
Creator: Alibaba
Parameters: 32B
Architecture: transformer-decoder
Context: 128K tokens
Released: Nov 12, 2024
License: Apache 2.0
Ollama: qwen2.5-coder:32b
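Beyond the one-line `ollama run` command above, Ollama also exposes a local REST API (by default at `http://localhost:11434/api/generate`) that editors and tools can call. A minimal sketch, assuming a locally running Ollama server with the model pulled; the `build_payload` helper only constructs the request body, so it can be inspected without a server, and `num_ctx` is how you ask for a larger slice of the model's 128K context (bigger windows need more memory):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_payload(prompt: str, model: str = "qwen2.5-coder:32b",
                  num_ctx: int = 8192) -> dict:
    """Build a non-streaming generate request for Ollama's REST API.

    num_ctx sets the context window to allocate; the model supports up
    to 128K tokens, but larger windows use proportionally more memory.
    """
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }

def generate(prompt: str) -> str:
    """POST the request to a locally running Ollama server."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires the server): generate("Reverse a string in Python.")
```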

Quantization Options

Format          | File Size | VRAM Required | Ollama Tag
Q4_K_M (rec.)   | 20 GB     | 23 GB         | 32b-instruct-q4_K_M
Q8_0            | 34.5 GB   | 39 GB         | 32b-instruct-q8_0
F16             | 65 GB     | 70 GB         | 32b-instruct-fp16
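The file sizes in the table follow directly from parameter count times bits per weight. A rough sketch of that arithmetic, where the effective bits-per-weight figures are approximations (Q4_K_M mixes 4- and 6-bit blocks, so it averages closer to ~4.85 bits; Q8_0 carries per-block scales, so ~8.5 bits) and the ~32.5B parameter count is an assumption about this model:

```python
def file_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate on-disk size in GB: parameters x bits per weight / 8.

    params_b is the parameter count in billions; sizes use decimal GB.
    VRAM needed at runtime is higher than this, since the KV cache and
    activations come on top of the weights.
    """
    return params_b * bits_per_weight / 8

# Approximate effective bits per weight for common GGUF formats (assumption).
BITS = {"Q4_K_M": 4.85, "Q8_0": 8.5, "F16": 16.0}

for fmt, bits in BITS.items():
    # For ~32.5B parameters this lands near the table's 20 / 34.5 / 65 GB.
    print(f"{fmt}: ~{file_size_gb(32.5, bits):.1f} GB on disk")
```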

Compatible Hardware

for Q4_K_M (23 GB VRAM)

Hardware                         | VRAM   | Type | Fit         | Est. Speed
Mac Studio M4 Ultra 512GB        | 512 GB | Mac  | Runs        | ~36 tok/s
Mac Pro M2 Ultra 192GB           | 192 GB | Mac  | Runs        | ~35 tok/s
Mac Studio M4 Ultra 192GB        | 192 GB | Mac  | Runs        | ~36 tok/s
Mac Studio M4 Max 128GB          | 128 GB | Mac  | Runs        | ~24 tok/s
MacBook Pro M4 Max 128GB         | 128 GB | Mac  | Runs        | ~24 tok/s
MacBook Pro M3 Max 96GB          | 96 GB  | Mac  | Runs        | ~17 tok/s
Mac mini M4 Pro 64GB             | 64 GB  | Mac  | Runs        | ~12 tok/s
Mac Studio M4 Max 64GB           | 64 GB  | Mac  | Runs        | ~24 tok/s
MacBook Pro M4 Max 64GB          | 64 GB  | Mac  | Runs        | ~24 tok/s
Mac mini M4 Pro 48GB             | 48 GB  | Mac  | Runs        | ~12 tok/s
MacBook Pro M3 Max 48GB          | 48 GB  | Mac  | Runs        | ~17 tok/s
MacBook Pro M4 Max 48GB          | 48 GB  | Mac  | Runs        | ~24 tok/s
MacBook Pro M4 Pro 48GB          | 48 GB  | Mac  | Runs        | ~12 tok/s
Mac Studio M4 Max 36GB           | 36 GB  | Mac  | Runs        | ~24 tok/s
MacBook Pro M3 Pro 36GB          | 36 GB  | Mac  | Runs        | ~7 tok/s
NVIDIA GeForce RTX 5090          | 32 GB  | GPU  | Runs        | ~78 tok/s
iMac M4 32GB                     | 32 GB  | Mac  | Runs        | ~5 tok/s
Mac mini M4 32GB                 | 32 GB  | Mac  | Runs        | ~5 tok/s
MacBook Air M4 32GB              | 32 GB  | Mac  | Runs        | ~5 tok/s
AMD Radeon RX 7900 XTX           | 24 GB  | GPU  | CPU offload | ~42 tok/s
NVIDIA GeForce RTX 3090          | 24 GB  | GPU  | CPU offload | ~41 tok/s
NVIDIA GeForce RTX 4090          | 24 GB  | GPU  | CPU offload | ~44 tok/s
iMac M3 24GB                     | 24 GB  | Mac  | CPU offload | ~4 tok/s
Mac mini M2 24GB                 | 24 GB  | Mac  | CPU offload | ~4 tok/s
Mac mini M4 Pro 24GB             | 24 GB  | Mac  | CPU offload | ~12 tok/s
MacBook Air M2 24GB              | 24 GB  | Mac  | CPU offload | ~4 tok/s
MacBook Air M4 24GB              | 24 GB  | Mac  | CPU offload | ~5 tok/s
MacBook Pro M4 Pro 24GB          | 24 GB  | Mac  | CPU offload | ~12 tok/s
AMD Radeon RX 7900 XT            | 20 GB  | GPU  | CPU offload | ~35 tok/s
MacBook Pro M3 Pro 18GB          | 18 GB  | Mac  | CPU offload | ~7 tok/s
AMD Radeon RX 6800 XT            | 16 GB  | GPU  | CPU offload | ~22 tok/s
AMD Radeon RX 7800 XT            | 16 GB  | GPU  | CPU offload | ~27 tok/s
Intel Arc A770                   | 16 GB  | GPU  | CPU offload | ~24 tok/s
NVIDIA GeForce RTX 4060 Ti 16GB  | 16 GB  | GPU  | CPU offload | ~13 tok/s
NVIDIA GeForce RTX 4070 Ti Super | 16 GB  | GPU  | CPU offload | ~29 tok/s
NVIDIA GeForce RTX 4080 Super    | 16 GB  | GPU  | CPU offload | ~32 tok/s
NVIDIA GeForce RTX 4080          | 16 GB  | GPU  | CPU offload | ~31 tok/s
NVIDIA GeForce RTX 5070 Ti       | 16 GB  | GPU  | CPU offload | ~39 tok/s
NVIDIA GeForce RTX 5080          | 16 GB  | GPU  | CPU offload | ~42 tok/s
iMac M1 16GB                     | 16 GB  | Mac  | CPU offload | ~3 tok/s
iMac M4 16GB                     | 16 GB  | Mac  | CPU offload | ~5 tok/s
Mac mini M1 16GB                 | 16 GB  | Mac  | CPU offload | ~3 tok/s
Mac mini M4 16GB                 | 16 GB  | Mac  | CPU offload | ~5 tok/s
MacBook Air M2 16GB              | 16 GB  | Mac  | CPU offload | ~4 tok/s
MacBook Air M3 16GB              | 16 GB  | Mac  | CPU offload | ~4 tok/s
MacBook Air M4 16GB              | 16 GB  | Mac  | CPU offload | ~5 tok/s
MacBook Pro M1 16GB              | 16 GB  | Mac  | CPU offload | ~3 tok/s
MacBook Pro M2 Pro 16GB          | 16 GB  | Mac  | CPU offload | ~9 tok/s
A further 17 devices cannot run this model configuration at all.

Benchmark Scores

MMLU: 78.0