Z.ai GLM 4.6: What We Learned From 100 Million Open Source Downloads — Yuxuan Zhang, Z.ai

5.7K views · Nov 22, 2025 · 19:39 min
Takeaway

GLM 4.6 narrows the gap to frontier closed models on math, coding, and agent tasks through a multi-stage training curriculum that culminates in 200K-context agent training. The GLM series as a whole has passed 100M open-source downloads.

Summary

  • Yuxuan Zhang (Z.ai) reports 65+ open-source GLM models released since 2022, with 100M+ downloads across Hugging Face and ModelScope and 105K+ community projects on GitHub.
  • GLM 4.6 beats DeepSeek 3.2 and Claude Sonnet 4 on multiple benchmarks and ties for #1 with GPT-5 on LM Arena, where it is the only open-source model at that rank; it still trails Claude 4.5.
  • Z.ai built its own CCBench (74 tasks, run through Claude Code) to evaluate real-world agentic coding; GLM 4.6 reaches a 68.6% win rate against Claude Sonnet 4.
  • Training recipe: 15T general tokens → 7T reasoning/code tokens → mid-training on full-repository PRs and issues at 32K context → 500B tokens of synthetic reasoning data → 100B tokens of long-context and agent data at 200K context.
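
As a rough illustration, the training recipe in the last bullet can be written down as an ordered list of stages. This is only a sketch of the figures quoted in the summary, not Z.ai's actual configuration: the `Stage` class, the stage names, and the helper functions are hypothetical, the budgets are assumed to be token counts, and "32K" and "200K" are taken as nominal values of 32,000 and 200,000 tokens. The repo-level stage has no token budget in the summary, so it is left as `None`.

```python
from dataclasses import dataclass
from typing import Optional

T = 10**12  # one trillion tokens
B = 10**9   # one billion tokens


@dataclass(frozen=True)
class Stage:
    """One curriculum stage (hypothetical structure, for illustration only)."""
    name: str
    tokens: Optional[int]   # token budget; None where the summary gives no figure
    context: Optional[int]  # nominal context length; None where not stated


# Stage order and figures as quoted in the summary above.
STAGES = [
    Stage("general pre-training", 15 * T, None),
    Stage("reasoning/code pre-training", 7 * T, None),
    Stage("repo-level mid-training (PRs/issues)", None, 32_000),
    Stage("synthetic reasoning", 500 * B, None),
    Stage("long-context + agent", 100 * B, 200_000),
]


def total_stated_tokens(stages):
    """Sum the budgets of the stages that state one."""
    return sum(s.tokens for s in stages if s.tokens is not None)


def final_context(stages):
    """Context length of the last stage that states one."""
    contexts = [s.context for s in stages if s.context is not None]
    return contexts[-1] if contexts else None


if __name__ == "__main__":
    print(f"stated budget: {total_stated_tokens(STAGES) / T:.1f}T tokens")
    print(f"final context: {final_context(STAGES)} tokens")
```

The stated budgets sum to 22.6T tokens. The context length grows only in the later stages, which matches the summary's description of a curriculum that ends with long-context agent data.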
glm · open-source-llm · z-ai
Original description
GLM 4.6 is the only open-source model currently tied for #1 on the LMSYS Chatbot Arena, standing shoulder-to-shoulder with GPT-4o and Claude 3.5 Sonnet. In this talk, Zhang Yuxuan from Z.ai breaks down the technical roadmap that led to over 100 million downloads across the GLM family.

Zhang dives deep into the specific training recipes behind GLM 4.6, including the move to single-stage Reinforcement Learning (RL), the "SLIME" RL framework for handling complex agent trajectories, and how the team structured 15 trillion tokens of pre-training data. If you are building AI agents or training LLMs, this breakdown offers a rare look inside a frontier-class open-source model.

In this video, we cover:

The Data Recipe: How Z.ai filters 15T tokens, moves to repo-level code contexts, and integrates agentic reasoning data.

SLIME Framework: A look at the hybrid synchronous/asynchronous architecture used to train agents without bottlenecking GPU clusters.
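
The asynchronous half of that idea can be sketched with a producer/consumer queue: rollout workers finish agent episodes at different times and push them to a shared queue, while the trainer pulls fixed-size batches as soon as enough trajectories are available. This is a toy illustration using Python's standard `asyncio` module, not SLIME's API; every name here (`rollout_worker`, `trainer`, `main`) is hypothetical, and the sleep call merely stands in for slow environment steps.

```python
import asyncio
import random


async def rollout_worker(worker_id, n_episodes, queue, rng):
    """Produce trajectories at an uneven pace, like long agent episodes."""
    for episode in range(n_episodes):
        # Stand-in for environment/tool latency of varying length.
        await asyncio.sleep(rng.uniform(0.001, 0.01))
        await queue.put({"worker": worker_id, "episode": episode,
                         "reward": rng.random()})


async def trainer(queue, batch_size, n_batches):
    """Consume fixed-size batches as soon as enough trajectories arrive."""
    batches = []
    for _ in range(n_batches):
        batch = [await queue.get() for _ in range(batch_size)]
        batches.append(batch)  # a real trainer would run a gradient step here
    return batches


async def main(n_workers=4, n_episodes=6, batch_size=8):
    queue = asyncio.Queue(maxsize=16)  # bounded, so workers cannot run far ahead
    rng = random.Random(0)
    n_batches = (n_workers * n_episodes) // batch_size
    workers = [
        asyncio.create_task(rollout_worker(i, n_episodes, queue, rng))
        for i in range(n_workers)
    ]
    batches = await trainer(queue, batch_size, n_batches)
    await asyncio.gather(*workers)
    return batches


if __name__ == "__main__":
    result = asyncio.run(main())
    print(f"trained on {len(result)} batches of {len(result[0])} trajectories")
```

Because the trainer never waits for a specific worker, one slow episode does not stall the whole batch. A synchronous mode would instead wait for every worker to finish before each update, which is simpler but leaves the trainer idle while the slowest rollout completes.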

RL Lessons: Why Z.ai abandoned multi-stage RL in favor of single-stage training to preserve long-context capabilities.

GLM 4.5V: How native resolution processing improves UI navigation and video understanding.

Timestamps:
00:00 - Introduction & The GLM Ecosystem
00:55 - 100 Million Downloads & Open Source Roadmap
03:22 - Tying GPT-4o on LMSYS Arena
05:04 - The Training Pipeline: From Pre-training to Long Context
07:54 - Introducing SLIME: Efficient RL for Agents
11:08 - The "Two-Stage" Curriculum Strategy
11:57 - Why Single-Stage RL beats Multi-Stage RL
12:55 - Token-Weighted Loss for Coding
14:13 - GLM 4.5V: Multimodal & Video Understanding
16:07 - Deployment: vLLM, SGLang, and Hugging Face
18:06 - Coding Assistants & Future Plans

Zhang Yuxuan recently started a PhD at the University of Liverpool and currently works at Z.ai. Zhang (zR) is passionate about open-source initiatives and aims to explore the area more deeply. Their main activities include: research on models such as GLM-4.5 (https://arxiv.org/abs/2508.06471), GLM-4.5V (https://arxiv.org/abs/2507.01006), CogVideoX (https://arxiv.org/abs/2408.06072), and CogAgent (https://arxiv.org/abs/2312.08914); research on model agent capabilities and their integration with agent frameworks such as langchain-chatchat (https://github.com/chatchat-space/Langchain-Chatchat) and chatpdf (https://github.com/CosmosShadow/gptpdf); and participation in several national competitions, such as RoboMaster and the National Students' SmartCar Competition, with results that include national awards. Zhang found these competitions fascinating, enjoys hackathons, and welcomes teaming up for them.

---
Socials:
- LinkedIn: https://www.linkedin.com/in/yuxuan-zhang-86a124282/
- X (Twitter): https://x.com/zRdianjiao
- GitHub: https://github.com/zRzRzRzRzRzRzR
- Website: https://huggingface.co/ZHANGYUXUAN-zR
- Company: Z.ai (https://z.ai)