Qwen 2.5 VL 72B
Apache 2.0 · Alibaba · 72B · transformer-decoder
2025-01-26 · 33K context
72B params
Use Cases
chat · vision · reasoning · multilingual · math
About this model
Qwen 2.5 VL 72B is Alibaba's flagship vision-language model, with 72 billion parameters. It is suited to complex visual reasoning, solving math problems from images, document analysis, and multilingual visual understanding.
It performs strongly on vision benchmarks such as MathVista and DocVQA, which makes it a good fit for demanding multimodal workloads when sufficient hardware is available.
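To give a rough sense of what "sufficient hardware" means at this size, the sketch below estimates the memory needed for the weights alone at a few common precisions. The 72B parameter count comes from this page; the bit widths are illustrative assumptions rather than a list of published quantizations, and the helper name weight_memory_gb is made up for this example. The estimate leaves out the KV cache, activations, and per-image vision tokens, so real requirements will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 72e9  # parameter count stated on this page

# Bit widths below are assumed for illustration only.
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.0f} GB for weights")
```

Halving the bits per weight halves the weight footprint, which is why lower-precision variants are the usual route to running a model of this size on fewer or smaller accelerators.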
Benchmarks
MMLU: 85.0