Qwen 2.5 VL 72B

Apache 2.0

Alibaba · 72B · transformer-decoder

2025-01-26 · 33K context · 72B params

Use Cases

chat · vision · reasoning · multilingual · math

Quantization Options

Quant           Bits   VRAM      Quality   Status
Q4_K_M (rec)    4      41.0 GB   Good
Q8_0            8      78.0 GB   Good
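
As a rough sanity check on the VRAM column, the sketch below computes a weights-only footprint from the parameter count and the nominal bit width, then compares it with the listed figure. It assumes the "Bits" value is a nominal bits-per-weight number and that 1 GB means 10^9 bytes; the function name `weight_memory_gb` is illustrative and not part of any library. The gap between the estimate and the listed value presumably covers things like the KV cache, activations, and the fact that quantization formats can average more than their nominal bit width, but the page does not say how its numbers were derived.

```python
def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


PARAMS = 72e9  # 72B parameters, as stated on the model card

# (nominal bits, listed VRAM in GB), copied from the quantization table
LISTED = {"Q4_K_M": (4, 41.0), "Q8_0": (8, 78.0)}

for name, (bits, listed_gb) in LISTED.items():
    weights_gb = weight_memory_gb(PARAMS, bits)
    # The difference is an unexplained remainder, not a measured overhead.
    print(
        f"{name}: weights ~{weights_gb:.1f} GB, listed {listed_gb:.1f} GB, "
        f"difference ~{listed_gb - weights_gb:.1f} GB"
    )
```

Treat the result as a lower bound when sizing hardware: the listed VRAM figures, not the weights-only estimate, are the numbers to plan around.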

About this model

Qwen 2.5 VL 72B is Alibaba's flagship vision-language model, with 72 billion parameters. It is aimed at complex visual reasoning, mathematical problem solving from images, document analysis, and multilingual visual understanding. It performs strongly on benchmarks such as MathVista and DocVQA, which makes it a good fit for demanding multimodal workloads when enough hardware is available: roughly 41 GB of VRAM at Q4_K_M or 78 GB at Q8_0.

Benchmarks

MMLU: 85.0