GLM-5.1
by Zhipu AI · glm family
GLM-5.1 is Zhipu AI's next-generation flagship model for agentic engineering, succeeding GLM-5 with significantly stronger coding and long-horizon task capabilities. A 754B-parameter Mixture-of-Experts model with 40B active parameters per token, it achieves state-of-the-art performance on SWE-Bench Pro (58.4), outperforming GPT-5.4, Claude Opus 4.6, and Gemini 3.1 Pro. Designed for sustained autonomous work, GLM-5.1 can operate on a single task for up to 8 hours: planning, executing, testing, and iterating across hundreds of rounds and thousands of tool calls. Its MoE architecture keeps VRAM requirements manageable despite the massive parameter count, making it accessible on high-end consumer hardware at Q4_K_M quantization.
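The sparsity implied by the numbers above is easy to make concrete: only a small fraction of the weights participate in any given token (a back-of-the-envelope sketch using the figures quoted in this card):

```python
# MoE sparsity from the figures above: 40B active of 754B total parameters.
TOTAL_PARAMS_B = 754   # total parameters, in billions
ACTIVE_PARAMS_B = 40   # parameters active per token, in billions

active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Active per token: {active_fraction:.1%}")  # roughly 5.3%
```

Per-token compute cost therefore tracks the 40B active parameters rather than the full 754B.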
Quick Start with Ollama

```shell
ollama run glm-5.1:latest
```

Quantization Options
| Format | File Size | VRAM Required | Ollama Tag |
|---|---|---|---|
| Q2_K (recommended) | 285 GB | 305 GB | latest |
| Q4_K_M | 430 GB | 450 GB | q4_K_M |
| Q8_0 | 800 GB | 820 GB | q8_0 |
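As a rough cross-check of the table above, the effective bits per weight of each quantization can be derived from the file size and the 754B parameter count (a sketch that treats GB as decimal gigabytes and ignores container metadata overhead):

```python
# Effective bits per weight = file size in bits / parameter count.
PARAMS = 754e9  # total parameter count from the model card

for fmt, size_gb in [("Q2_K", 285), ("Q4_K_M", 430), ("Q8_0", 800)]:
    bits_per_weight = size_gb * 1e9 * 8 / PARAMS
    print(f"{fmt}: {bits_per_weight:.2f} bits/weight")
# Q2_K:   3.02 bits/weight
# Q4_K_M: 4.56 bits/weight
# Q8_0:   8.49 bits/weight
```

The results land in the neighborhood expected for the corresponding llama.cpp quantization formats, which suggests the file sizes in the table are internally consistent.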
Compatible Hardware

Even the smallest available quantization, Q2_K, requires 305 GB of VRAM.