Qwen 2.5 72B

by Alibaba · qwen-2.5 family

72B parameters

text-generation code-generation reasoning multilingual tool-use math creative-writing summarization

Qwen 2.5 72B is the flagship model of the Qwen 2.5 series, aimed at reasoning, coding, math, and multilingual work. It competes directly with the best open models at the 70B scale from Meta and others. Even at the recommended Q4_K_M quantization it needs about 44.7 GB of VRAM, so a single 32 GB GPU falls back to CPU offload; comfortable inference calls for a multi-GPU setup or a high-memory Mac. It supports a 128K-token context and tool use, and covers 29+ languages, with its strongest results in Chinese and English.

Quick Start with Ollama

ollama run qwen2.5:72b-instruct-q4_K_M
Creator: Alibaba
Parameters: 72B
Architecture: transformer-decoder
Context Length: 128K tokens
License: Qwen License
Released: Sep 19, 2024
Ollama: qwen2.5:72b
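
Once the model has been pulled, it can also be queried from a script instead of the interactive prompt. The following is a minimal sketch, assuming a local Ollama server on its default port (11434) and its `/api/generate` endpoint; the helper names `build_request` and `generate` are illustrative and not part of any library.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model name plus the recommended quantization tag from this page.
MODEL = "qwen2.5:72b-instruct-q4_K_M"


def build_request(prompt, model=MODEL, url=OLLAMA_URL):
    """Build a non-streaming generate request (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, timeout=600):
    """Send the prompt and return the generated text from the JSON reply."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Example call (requires a running server with the model pulled):
# print(generate("Summarize the Qwen 2.5 model family in one sentence."))
```

The long timeout is a guess that allows for slow first-token latency on a 72B model; adjust it to the hardware in use.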

Quantization Options

Format                 File Size   VRAM Required   Ollama Tag
Q4_K_M (recommended)   35.8 GB     44.7 GB         72b-instruct-q4_K_M
Q5_K_M                 41.8 GB     51.9 GB         72b-instruct-q5_K_M
Q8_0                   66.6 GB     74 GB           72b-instruct-q8_0
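
The figures in the table can be turned into two derived numbers: the effective bits stored per weight, and the gap between the VRAM requirement and the file size. The sketch below uses only the values listed above and assumes the nominal 72 billion parameter count; the function names are illustrative, and the page does not say what the extra VRAM is spent on.

```python
# Figures copied from the quantization table (sizes in GB).
QUANTS = {
    "Q4_K_M": {"file_gb": 35.8, "vram_gb": 44.7},
    "Q5_K_M": {"file_gb": 41.8, "vram_gb": 51.9},
    "Q8_0": {"file_gb": 66.6, "vram_gb": 74.0},
}
PARAMS_BILLIONS = 72.0  # assumption: nominal parameter count from this page


def bits_per_weight(file_gb, params_billions=PARAMS_BILLIONS):
    """Effective bits per parameter implied by the listed file size."""
    return file_gb * 8 / params_billions


def overhead_gb(file_gb, vram_gb):
    """VRAM listed beyond the size of the weights file."""
    return vram_gb - file_gb


for name, q in QUANTS.items():
    bpw = bits_per_weight(q["file_gb"])
    extra = overhead_gb(q["file_gb"], q["vram_gb"])
    print(f"{name}: {bpw:.2f} bits/weight, {extra:.1f} GB beyond file size")
```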

Compatible Hardware for Q4_K_M

Showing compatibility for the recommended quantization (Q4_K_M, 44.7 GB VRAM).

Hardware                      VRAM     Type   Fit
Mac Pro M2 Ultra 192GB        192 GB   mac    Runs
Mac Studio M4 Ultra 192GB     192 GB   mac    Runs
Mac Studio M4 Max 128GB       128 GB   mac    Runs
MacBook Pro M4 Max 128GB      128 GB   mac    Runs
Mac Studio M4 Max 64GB        64 GB    mac    Runs
MacBook Pro M4 Max 64GB       64 GB    mac    Runs
Mac mini M4 Pro 48GB          48 GB    mac    Runs (tight)
MacBook Pro M4 Max 48GB       48 GB    mac    Runs (tight)
MacBook Pro M4 Pro 48GB       48 GB    mac    Runs (tight)
NVIDIA GeForce RTX 5090       32 GB    gpu    CPU Offload
Mac mini M4 32GB              32 GB    mac    CPU Offload
25 other hardware devices cannot run this model configuration.
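
The fit labels in the hardware list can be approximated with a simple threshold rule against the 44.7 GB requirement. This is a sketch only: the page does not publish its cut-offs, so `tight_margin` and `min_offload_fraction` are hypothetical values chosen to reproduce the rows shown, and `classify_fit` is an illustrative name.

```python
REQUIRED_VRAM_GB = 44.7  # Q4_K_M requirement from this page


def classify_fit(memory_gb, required_gb=REQUIRED_VRAM_GB,
                 tight_margin=1.2, min_offload_fraction=0.5):
    """Map available memory to a fit label.

    tight_margin and min_offload_fraction are assumptions, not the
    page's actual thresholds.
    """
    if memory_gb >= required_gb * tight_margin:
        return "Runs"
    if memory_gb >= required_gb:
        return "Runs (tight)"
    if memory_gb >= required_gb * min_offload_fraction:
        return "CPU Offload"
    return "Cannot run"


for name, gb in [("Mac Studio M4 Max 64GB", 64),
                 ("Mac mini M4 Pro 48GB", 48),
                 ("NVIDIA GeForce RTX 5090", 32)]:
    print(f"{name}: {classify_fit(gb)}")
```

With these assumed thresholds, the 64 GB, 48 GB, and 32 GB devices receive the same labels as in the list; devices well below the requirement fall into the "Cannot run" group.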

Benchmark Scores

MMLU: 85.3