GLM-5
by Zhipu AI · glm family · 744B parameters
Tags: text-generation, code-generation, reasoning, multilingual, tool-use, math
GLM-5 is Zhipu AI's flagship reasoning model: a 744B-parameter Mixture-of-Experts with 40B active parameters per token. It achieves state-of-the-art results on reasoning and agentic benchmarks, competing with the best closed-source models. At 281 GB even under aggressive 2-bit quantization, GLM-5 requires enterprise-grade hardware: multiple high-VRAM GPUs or a Mac Studio/Pro with 300 GB+ of unified memory. It is not practical on consumer hardware, but it is available through Ollama for those with the resources.
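A back-of-the-envelope check on the quoted file size. The bits-per-weight figure below is an assumption (llama.cpp's Q2_K mixes block formats and keeps some layers at higher precision, so the effective rate lands near 3 bits/weight, not exactly 2):

```python
# Rough file-size estimate for the quantized checkpoint.
# Assumption (not from the model card): Q2_K averages ~3.0 effective
# bits per weight once quantization scales and higher-precision
# layers are included. Total parameters matter for storage; the 40B
# "active" count only affects per-token compute.
PARAMS = 744e9          # total parameters
BITS_PER_WEIGHT = 3.0   # assumed effective bits/weight for Q2_K

size_gb = PARAMS * BITS_PER_WEIGHT / 8 / 1e9
print(f"~{size_gb:.0f} GB")  # lands near the 281 GB quoted above
```

The small gap between this estimate and 281 GB is absorbed by metadata, embeddings, and the per-block scale factors the estimate ignores.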
Quick Start with Ollama
ollama run glm-5:latest

Quantization Options
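Beyond the CLI, a running Ollama server exposes a local HTTP API. A minimal sketch of a non-streaming request, assuming the model was pulled under the hypothetical tag `glm-5:latest` (substitute whatever `ollama list` shows on your machine):

```python
import json

# Hypothetical tag -- replace with the name shown by `ollama list`.
payload = {
    "model": "glm-5:latest",
    "prompt": "Explain mixture-of-experts routing in two sentences.",
    "stream": False,  # one JSON object back instead of a token stream
}
body = json.dumps(payload).encode()

# To actually send it (requires the Ollama server running locally):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

With `"stream": False` the server returns the full completion in one JSON object whose `response` field holds the generated text.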
| Format | File Size | VRAM Required | Quality | Ollama Tag |
|---|---|---|---|---|
| Q2_K (recommended) | 281 GB | 300 GB | | latest |
Compatible Hardware
The Q2_K quantization requires at least 300 GB of VRAM or unified memory, e.g. several data-center GPUs pooled together, or a Mac Studio/Pro with 300 GB+ of unified memory.
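A quick way to sanity-check a multi-GPU setup against that requirement is to sum the per-device VRAM. The GPU list below is a hypothetical example configuration, not a recommendation:

```python
# Does combined VRAM cover the Q2_K footprint?
REQUIRED_GB = 300  # from the table above (weights plus KV-cache headroom)

# Hypothetical setup: four 80 GB accelerators.
gpus_gb = [80, 80, 80, 80]
total = sum(gpus_gb)
print(f"{total} GB available -> "
      f"{'OK' if total >= REQUIRED_GB else 'insufficient'}")
```

Note that splitting a model across GPUs adds interconnect overhead, so meeting the raw total is a necessary condition, not a guarantee of good throughput.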