GLM-5 Released: Long-Horizon Planning with Larger Model and Data

GLM-5 expands to 744B params (40B active) and 28.5T tokens to improve long-horizon planning, resource management, and agentic tasks. Internal benchmarks report major gains over GLM-4.7 and top open-source performance on long-horizon Vending Bench 2.


TL;DR

  • Model scaling: 744B params (40B active), up from 355B (32B active), across available configurations
  • Training data increase: Pre-training tokens raised from 23T to 28.5T
  • Performance focus: Emphasis on frontend, backend, and extended-horizon tasks; improved long-term planning and resource management
  • Evaluation highlights: Internal CC-Bench-V2 shows gains vs GLM-4.7; smaller gap to Claude Opus 4.5
  • Long-horizon benchmark: Vending Bench 2 ranked #1 among open-source models; final balance $4,432
  • Access and rollout: http://chat.z.ai, http://huggingface.co/zai-org/GLM-5, http://openrouter.ai/z-ai/glm-5; staged for subscribers; higher quota cost than GLM-4.7

GLM-5 arrives for long-horizon and agentic engineering

The new model, GLM-5, targets complex systems engineering and extended agentic tasks by scaling both model capacity and pre-training data. The release emphasizes improvements in long-term planning and resource management, with multiple access points and a staged rollout for paid subscribers.

What changed under the hood

  • Model scaling: GLM-5 expands from 355B params (32B active) to 744B params (40B active) across available configurations.
  • Training data: Pre-training tokens increased from 23T to 28.5T, a roughly 24% larger corpus.
  • Performance focus: Architecture and training choices emphasize capabilities for frontend, backend, and extended-horizon tasks.

These core numbers define the most relevant technical differences for developers and researchers comparing iterations.
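As a rough illustration of what these figures imply (the ratios below are derived from the numbers above, not stated in the release), the share of parameters active per token actually shrinks even as total capacity more than doubles:

```python
# Illustrative arithmetic only: parameter counts taken from the release summary above.
models = {
    "GLM-4.7": {"total_b": 355, "active_b": 32},
    "GLM-5":   {"total_b": 744, "active_b": 40},
}

for name, p in models.items():
    # Percentage of total parameters active for any given token.
    frac = p["active_b"] / p["total_b"] * 100
    print(f"{name}: {p['total_b']}B total, {p['active_b']}B active ({frac:.1f}% active)")
```

Assuming these are mixture-of-experts active-parameter counts, as such figures usually are, GLM-5 is proportionally sparser: each token touches about 5.4% of the weights, versus roughly 9% in GLM-4.7.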

Evaluation highlights

On the internal suite CC-Bench-V2, GLM-5 shows substantial improvement over GLM-4.7 across frontend, backend, and long-horizon evaluations, narrowing the gap to Claude Opus 4.5.

In a long-horizon scenario benchmark, Vending Bench 2, GLM-5 ranked #1 among open-source models, ending with a final account balance of $4,432. That result indicates notable gains in long-term planning and resource allocation over prior open-source baselines, approaching Claude Opus 4.5 on these tasks.

Access, weights, and integration

For subscribers to the GLM Coding Plan, the rollout is staged due to compute constraints:

  • Max plan users: Can enable GLM-5 immediately by updating the model name to "GLM-5" (for example, edit ~/.claude/settings.json for Claude Code).
  • Other plan tiers: Support will be added progressively as rollout expands.
  • Quota note: Requests to GLM-5 consume more plan quota than GLM-4.7.
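For Max plan users on Claude Code, the switch amounts to a small settings change. A minimal sketch of `~/.claude/settings.json` is below; note that only the model name change is described in this post — the `env` block is a hypothetical illustration of a typical Anthropic-compatible endpoint setup, and the base URL and token placeholder are assumptions, not values from the release:

```json
{
  "model": "GLM-5",
  "env": {
    "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
    "ANTHROPIC_AUTH_TOKEN": "YOUR_GLM_CODING_PLAN_KEY"
  }
}
```

JSON does not permit comments, so keep the file free of them; consult your plan's own setup instructions for the correct endpoint and credential names.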

Practical implications

The combination of larger active parameter counts and more training tokens directly targets workloads requiring sustained reasoning and multi-step orchestration. The evaluation outcomes suggest GLM-5 may be particularly relevant where long-horizon planning and resource management are central concerns, while the staged rollout and higher quota consumption will factor into integration planning for subscription users.

For further details and the full technical discussion, see the original post: http://z.ai/blog/glm-5

Continue the conversation on Slack

Did this article spark your interest? Join our community of experts and enthusiasts to dive deeper, ask questions, and share your ideas.
