Qwen 2.5 72B

Name: Qwen 2.5 72B
Author: Alibaba

72B parameters

Tags: text-generation, code-generation, reasoning, multilingual, tool-use, math, creative-writing, summarization

Qwen 2.5 72B is the flagship model of the Qwen 2.5 series, with strong results on reasoning, coding, math, and multilingual benchmarks. It competes directly with the leading open models at the 70B scale from Meta and others. Even the recommended Q4_K_M quantization needs about 44.7 GB of VRAM, more than a single 32 GB consumer GPU provides, so comfortable inference calls for a multi-GPU setup or a machine with a large pool of unified memory. The model supports a 128K-token context window and tool use, and it covers more than 29 languages, with its strongest results in Chinese and English.

Quick Start with Ollama


ollama run qwen2.5:72b-instruct-q4_K_M
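
Besides the interactive CLI, a running Ollama server can also be queried over HTTP. The following is a minimal sketch, assuming the server listens on its default local address (`http://localhost:11434`) and that the model has already been pulled; the helper names `build_generate_request` and `generate` are hypothetical and exist only for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Ollama model name as listed on this page.
MODEL = "qwen2.5:72b"


def build_generate_request(prompt: str, model: str = MODEL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL) -> str:
    """Send the prompt to the server and return the generated text."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Explain quantization in one sentence.")` only works while the Ollama server is up. To target a specific quantization, pass a full tag as the `model` argument.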

Creator	Alibaba
Parameters	72B
Architecture	transformer-decoder
Context Length	128K tokens
License	Qwen License
Released	Sep 19, 2024
Ollama	qwen2.5:72b

Quantization Options

Format	File Size	VRAM Required	Quality	Ollama Tag
Q4_K_M (recommended)	35.8 GB	44.7 GB	★ ★ ★ ★ ★	`72b-instruct-q4_K_M`
Q5_K_M	41.8 GB	51.9 GB	★ ★ ★ ★ ★	`72b-instruct-q5_K_M`
Q8_0	66.6 GB	74 GB	★ ★ ★ ★ ★	`72b-instruct-q8_0`
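
The table's figures can be compared with a little arithmetic. The sketch below derives the average bits stored per weight and the gap between file size and required VRAM for each format. It assumes 1 GB means 10^9 bytes and uses the nominal 72 billion parameters. The page does not say what the extra VRAM covers, so reading it as runtime overhead (for example KV cache and buffers) is an assumption. The function names are hypothetical.

```python
# Figures copied from the quantization table: (file size in GB, VRAM required in GB).
QUANTS = {
    "Q4_K_M": (35.8, 44.7),
    "Q5_K_M": (41.8, 51.9),
    "Q8_0": (66.6, 74.0),
}
PARAMS_BILLIONS = 72  # nominal parameter count stated on the page


def bits_per_weight(file_size_gb: float, params_billions: float = PARAMS_BILLIONS) -> float:
    """Average bits per parameter, assuming 1 GB = 10**9 bytes."""
    return file_size_gb * 8 / params_billions


def runtime_overhead_gb(file_size_gb: float, vram_gb: float) -> float:
    """VRAM listed beyond the file size (assumed to be runtime overhead)."""
    return vram_gb - file_size_gb


for name, (size_gb, vram_gb) in QUANTS.items():
    print(
        f"{name}: {bits_per_weight(size_gb):.2f} bits/weight, "
        f"{runtime_overhead_gb(size_gb, vram_gb):.1f} GB beyond file size"
    )
```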

Compatible Hardware for Q4_K_M

Showing compatibility for the recommended quantization (Q4_K_M, 44.7 GB VRAM).

Hardware	VRAM	Type	Fit
Mac Pro M2 Ultra 192GB	192 GB	mac	Runs
Mac Studio M4 Ultra 192GB	192 GB	mac	Runs
Mac Studio M4 Max 128GB	128 GB	mac	Runs
MacBook Pro M4 Max 128GB	128 GB	mac	Runs
Mac Studio M4 Max 64GB	64 GB	mac	Runs
MacBook Pro M4 Max 64GB	64 GB	mac	Runs
Mac mini M4 Pro 48GB	48 GB	mac	Runs (tight)
MacBook Pro M4 Max 48GB	48 GB	mac	Runs (tight)
MacBook Pro M4 Pro 48GB	48 GB	mac	Runs (tight)
NVIDIA GeForce RTX 5090	32 GB	gpu	CPU Offload
Mac mini M4 32GB	32 GB	mac	CPU Offload

25 other hardware devices cannot run this model configuration.
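
The Fit column appears to follow from comparing each device's memory with the 44.7 GB requirement. The sketch below reproduces the three labels shown in the table. The 1.2x headroom factor separating "Runs" from "Runs (tight)" is a hypothetical value chosen to match the listed rows, not a rule stated by the page. The cut-off below which a device cannot run the model at all is not given, so it is not modeled here.

```python
REQUIRED_GB = 44.7  # VRAM listed on this page for Q4_K_M
HEADROOM = 1.2      # hypothetical margin; the page does not state its rule


def classify_fit(memory_gb: float, required_gb: float = REQUIRED_GB) -> str:
    """Label a device by how its memory compares with the requirement."""
    if memory_gb >= required_gb * HEADROOM:
        return "Runs"
    if memory_gb >= required_gb:
        return "Runs (tight)"
    # Less memory than required: part of the model must live in system RAM.
    return "CPU Offload"


for device, memory_gb in [
    ("Mac Studio M4 Max 64GB", 64),
    ("Mac mini M4 Pro 48GB", 48),
    ("NVIDIA GeForce RTX 5090", 32),
]:
    print(f"{device}: {classify_fit(memory_gb)}")
```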

Benchmark Scores

Benchmark	Score
MMLU	85.3