GLM-5 arrives for long-horizon and agentic engineering
The new model, GLM-5, targets complex systems engineering and extended agentic tasks by scaling both model capacity and pre-training data. The release emphasizes improvements in long-term planning and resource management, with multiple access points and a staged rollout for paid subscribers.
What changed under the hood
- Model scaling: GLM-5 expands from 355B total parameters (32B active) to 744B total (40B active).
- Training data: Pre-training tokens increased from 23T to 28.5T.
- Performance focus: Architecture and training choices emphasize capabilities for frontend, backend, and long-horizon tasks.
These core numbers capture the most relevant technical differences for developers and researchers comparing the two generations.
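One detail worth noting: the active share of parameters actually shrinks between generations, consistent with a sparser mixture-of-experts-style configuration (this is an inference from the numbers above, not a claim from the release notes). A minimal sketch of the arithmetic:

```python
# Back-of-the-envelope check: fraction of parameters active per token,
# using the figures quoted above.
configs = {
    "GLM-4.x": (355e9, 32e9),   # (total params, active params)
    "GLM-5":   (744e9, 40e9),
}

for name, (total, active) in configs.items():
    print(f"{name}: {active / total:.1%} of parameters active per token")

# Output: GLM-4.x ~9.0% active, GLM-5 ~5.4% active. The active fraction
# drops even as the total grows, keeping per-token compute growth modest.
```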
Evaluation highlights
On the internal suite CC-Bench-V2, GLM-5 shows substantial improvement over GLM-4.7 across frontend, backend, and long-horizon evaluations, narrowing the gap to Claude Opus 4.5.
On Vending Bench 2, a long-horizon scenario benchmark, GLM-5 ranked #1 among open-source models, ending with a final account balance of $4,432. The result indicates notable gains in long-term planning and resource allocation over prior open-source baselines and approaches Claude Opus 4.5 on these tasks.
Access, weights, and integration
- Interactive demo: http://chat.z.ai
- Model weights: http://huggingface.co/zai-org/GLM-5
- OpenRouter support: http://openrouter.ai/z-ai/glm-5 (see the API sketch after this list)
- Technical write-up: http://z.ai/blog/glm-5
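As a quick integration sketch, GLM-5 should be reachable through OpenRouter's standard OpenAI-compatible endpoint. The model identifier z-ai/glm-5 is taken from the OpenRouter URL above; the API key and prompt are placeholders:

```python
# Minimal sketch: querying GLM-5 via OpenRouter's OpenAI-compatible
# chat completions endpoint.
from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # OpenRouter's API base
    api_key="<your-openrouter-key>",          # placeholder
)

response = client.chat.completions.create(
    model="z-ai/glm-5",  # model id taken from the OpenRouter URL above
    messages=[
        {"role": "user", "content": "Plan a three-step refactor of a legacy ETL job."}
    ],
)
print(response.choices[0].message.content)
```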
For GLM Coding Plan subscribers, the rollout is staged due to compute constraints:
- Max plan users: Can enable GLM-5 immediately by updating the model name to "GLM-5" (for example, by editing ~/.claude/settings.json for Claude Code; see the sketch after this list).
- Other plan tiers: Support will be added progressively as rollout expands.
- Quota note: Requests to GLM-5 consume more plan quota than GLM-4.7.
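A minimal sketch of the Claude Code configuration change, assuming the plan routes requests via the ANTHROPIC_* environment overrides that Claude Code reads from ~/.claude/settings.json. Only the "GLM-5" model name comes from the announcement; the endpoint and token values below are placeholders, so consult the plan documentation for the exact values:

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://<your-glm-coding-plan-endpoint>",
    "ANTHROPIC_AUTH_TOKEN": "<your-plan-api-key>",
    "ANTHROPIC_MODEL": "GLM-5"
  }
}
```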
Practical implications
The combination of larger active parameter counts and more training tokens directly targets workloads requiring sustained reasoning and multi-step orchestration. The evaluation outcomes suggest GLM-5 may be particularly relevant where long-horizon planning and resource management are central concerns, while the staged rollout and higher quota consumption will factor into integration planning for subscription users.
For further details and the full technical discussion, see the original post: http://z.ai/blog/glm-5
