AI Powerhouse Unveils Groundbreaking Model with Four Major Innovations and Broad Domestic Chip Support

Deep News
Yesterday

Leading AI firm Zhipu has disclosed technical details of its latest model. This afternoon, Zhipu, often described as the world's first publicly listed large language model company, officially released the GLM-5 technical report. Zhipu attributes GLM-5's significant performance leap primarily to four technological innovations. The model demonstrates unprecedented capability in real-world programming tasks, surpassing all prior open-source baselines on end-to-end software engineering challenges.

Regarding stock performance, on Friday, February 20th, Zhipu's stock price surged 42.72% in a single day, reaching HKD 725 per share, setting a new record high. The company's total market capitalization reached HKD 323.2 billion. Since its listing 43 days ago, the stock has accumulated a gain of over 500%. Following the release of GLM-5, and due to overwhelming demand, Zhipu announced price increases for its GLM Coding Plan subscription packages. Prices in China were raised by 30%, while the overseas version saw an increase of over 100%.

Zhipu's Latest Release

On the afternoon of February 22nd, according to a post on Zhipu's official social media account, the company launched GLM-5, a next-generation foundation model designed to shift the programming paradigm from "VibeCoding" to "Agentic Engineering." Building upon the Agentic, Reasoning, and Coding (ARC) capabilities of its predecessor, GLM-4.5, GLM-5 incorporates DeepSeek Sparse Attention (DSA) to significantly reduce inference costs while maintaining full long-context capabilities.

To better align the model with various tasks, Zhipu constructed a new asynchronous Reinforcement Learning (RL) infrastructure. By decoupling the generation process from the training process, this infrastructure greatly enhances post-training iteration efficiency. Furthermore, Zhipu introduced a novel asynchronous Agent RL algorithm, further improving the effectiveness of reinforcement learning and enabling the model to learn more efficiently from complex, long-horizon interactions.

Zhipu claims that, based on these innovations, GLM-5 achieves State-of-the-Art (SOTA) performance on mainstream open benchmarks. Most crucially, GLM-5 exhibits unprecedented ability in real-world programming tasks, exceeding all prior open-source baselines in tackling end-to-end software engineering challenges.

Zhipu pointed out that GLM-5 represents a leap in both performance and computational efficiency. It not only achieves SOTA levels on major leaderboards such as ArtificialAnalysis.ai and LMArena for text and code but also redefines real-world programming standards. It pushes beyond the boundaries of traditional static evaluations like SWE-bench, demonstrating strong capability on complex, end-to-end software development tasks.

Four Major Technological Innovations

According to the GLM-5 technical report, the substantial performance leap is attributed to the following four major innovations:

First, the introduction of the DeepSeek Sparse Attention (DSA) mechanism. This novel architecture significantly reduces training and inference costs. While the previous GLM-4.5 relied on a standard Mixture-of-Experts (MoE) architecture for efficiency, the DSA mechanism enables GLM-5 to dynamically allocate attention resources based on token importance. This drastically reduces computational overhead without compromising long-context understanding or reasoning depth. Leveraging this, Zhipu successfully scaled the model parameters to 744 billion and increased the training token count to 28.5 trillion.
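
The report's DSA implementation details are not reproduced in this article. As a rough picture of what "dynamically allocating attention based on token importance" can mean, the sketch below shows a generic top-k sparse attention step in PyTorch: scores are computed, and each query then attends only to its highest-scoring, causally valid positions. The function name and the top_k parameter are illustrative assumptions, not GLM-5's actual architecture.

```python
# Illustrative top-k sparse attention, single head for clarity.
# NOT GLM-5's actual DSA implementation -- a generic sketch of the idea.
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=64):
    """q, k, v: (batch, seq_len, dim). Each query keeps only its
    top_k highest-scoring keys; everything else is masked out."""
    scale = q.size(-1) ** -0.5
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale   # (B, S, S)

    # Causal mask: a query may attend only to itself and earlier tokens.
    seq_len = q.size(1)
    causal = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool,
                                   device=q.device), diagonal=1)
    scores = scores.masked_fill(causal, float("-inf"))

    # Keep the top_k scores per query row; drop the rest.
    kth = min(top_k, seq_len)
    topk_vals, _ = scores.topk(kth, dim=-1)
    threshold = topk_vals[..., -1:]                  # k-th best score per row
    scores = scores.masked_fill(scores < threshold, float("-inf"))

    attn = F.softmax(scores, dim=-1)
    return torch.matmul(attn, v)                     # (B, S, dim)
```

A production kernel would never materialize the full score matrix; the point of the sparsification is that each query's softmax and value aggregation touch only top_k positions instead of the whole context, which is where the inference savings come from.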

Second, the construction of a new asynchronous RL infrastructure. Building on the "decoupling of training and inference" design from the GLM-4.5 era's "slime" framework, Zhipu's new infrastructure achieves a deeper decoupling of "generation and training," pushing GPU utilization to its limits. This system supports large-scale Agent trajectory exploration by the model, significantly alleviating the synchronous bottlenecks that previously slowed iteration speed, resulting in a qualitative leap in the efficiency of the post-training RL pipeline.
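
As a way to visualize "decoupling generation from training," the toy sketch below runs rollout workers and a trainer as independent threads connected by a queue, so the generation side never blocks on the optimizer. This is purely illustrative; Zhipu's slime framework and the GLM-5 infrastructure operate at cluster scale with far more machinery.

```python
# Toy producer/consumer picture of generation decoupled from training.
# Purely illustrative -- not Zhipu's slime framework.
import queue
import threading
import time

trajectory_queue = queue.Queue(maxsize=1024)   # buffer between the two stages

def rollout_worker(worker_id: int, num_episodes: int) -> None:
    """Stands in for inference GPUs running long-horizon agent episodes."""
    for episode in range(num_episodes):
        time.sleep(0.01)                       # pretend to run an episode
        trajectory = {"worker": worker_id, "episode": episode,
                      "steps": [f"obs/action {i}" for i in range(5)]}
        trajectory_queue.put(trajectory)       # hand off; don't wait on trainer

def trainer(total_steps: int) -> None:
    """Stands in for training GPUs consuming trajectories as they arrive."""
    for _ in range(total_steps):
        trajectory = trajectory_queue.get()    # blocks only if buffer is empty
        # ... compute advantages and run a gradient step on `trajectory` ...
        trajectory_queue.task_done()

workers = [threading.Thread(target=rollout_worker, args=(i, 10))
           for i in range(4)]
train_thread = threading.Thread(target=trainer, args=(40,))
for t in workers + [train_thread]:
    t.start()
for t in workers + [train_thread]:
    t.join()
print("all trajectories consumed")
```

The synchronous bottleneck the article mentions disappears because neither side idles waiting for the other: generation and training throughput are coupled only through the depth of the buffer.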

Third, the proposal of a new asynchronous Agent RL algorithm. This algorithm aims to comprehensively enhance the model's autonomous decision-making quality. While GLM-4.5 relied on iterative self-distillation and outcome supervision to train Agents, the asynchronous algorithm developed for GLM-5 allows the model to continuously learn from diverse, long-horizon interactions. The algorithm is deeply optimized for planning and self-correction in dynamic environments, which underpins GLM-5's strong performance in real-world programming scenarios.
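
The article does not spell out the algorithm, so the snippet below shows only a textbook ingredient of asynchronous RL that the description hints at: when trajectories come from a slightly stale policy, the policy-gradient update is reweighted by a clipped importance ratio. This is a generic off-policy correction offered as an assumption, not GLM-5's actual training objective.

```python
# Generic importance-weighted policy-gradient loss for trajectories
# generated by an older (behavior) policy. A textbook off-policy
# correction -- NOT GLM-5's actual asynchronous Agent RL algorithm.
import torch

def async_pg_loss(logp_current: torch.Tensor,
                  logp_behavior: torch.Tensor,
                  advantages: torch.Tensor,
                  clip: float = 10.0) -> torch.Tensor:
    """logp_current: log-probs of taken actions under the latest policy.
    logp_behavior:  log-probs under the stale policy that generated them.
    advantages:     per-step advantage estimates. All shaped (num_steps,)."""
    # Importance ratio rho = pi_current / pi_behavior, clipped so that
    # trajectories from a badly outdated policy cannot blow up the update.
    rho = torch.exp(logp_current - logp_behavior.detach()).clamp(max=clip)
    # Reweighted policy gradient, negated so that minimizing the loss
    # increases expected advantage under the current policy.
    return -(rho.detach() * advantages.detach() * logp_current).mean()
```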

Fourth, full embrace of the domestic computing ecosystem. From its initial release, GLM-5 is natively adapted for the Chinese GPU ecosystem. Zhipu has completed deep optimizations from the underlying kernels to the upper-level inference framework, achieving full compatibility with seven major domestic chip platforms: Huawei Ascend, Moore Threads, Hygon, Cambricon, Kunlunxin, Tianshu Zhixin, and Enflame.

Zhipu stated, "With the above advancements, GLM-5 is not only a more powerful model but also a more efficient and practical foundation model for the next generation of AI Agents. We are open-sourcing GLM-5 to the community to further promote the development of efficient, Agent-oriented artificial general intelligence."

Zhipu's Apology

On the evening of February 21st, Zhipu published an apology letter regarding the GLM Coding Plan on its "Zhipu Open Platform" WeChat official account and announced handling and compensation measures.

Zhipu stated that it made three main mistakes with this update: insufficient rule transparency, an overly slow rollout of GLM-5 access, and a poorly designed upgrade mechanism for existing users.

The GLM Coding Plan is a paid subscription service specifically launched by Zhipu for AI programming scenarios. Upon subscription, developers can use Zhipu's large models to assist in writing code. The plan tiers are typically divided into Lite, Pro, and Max, corresponding to different usage quotas and model access permissions.
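
For context on what subscribing actually enables: Zhipu's open platform exposes an OpenAI-compatible chat API, so a minimal coding-assistant call might look like the sketch below. The model identifier "glm-5" and the endpoint path are assumptions based on Zhipu's existing GLM-4-era conventions; consult the official platform documentation for the real values.

```python
# Hedged sketch of calling a GLM model through Zhipu's OpenAI-compatible
# endpoint. The model name "glm-5" and the base_url are assumptions;
# check https://open.bigmodel.cn for the actual identifiers.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_ZHIPU_API_KEY",                      # issued in the Zhipu console
    base_url="https://open.bigmodel.cn/api/paas/v4/",  # assumed compatible path
)

response = client.chat.completions.create(
    model="glm-5",  # assumed identifier for the new model
    messages=[
        {"role": "system", "content": "You are a careful coding assistant."},
        {"role": "user",
         "content": "Write a Python function that reverses a singly linked list."},
    ],
)
print(response.choices[0].message.content)
```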

It is understood that the GLM Coding Plan sold out immediately upon launch, a rare occurrence in the industry for a paid subscription to a domestic AI programming model.

The explosive demand negatively impacted the user experience of the GLM Coding Plan. In the apology letter, Zhipu explained that it had recently suffered attacks from gray-market account pools and scalpers, which maliciously occupied a large amount of company resources. At the same time, traffic after the GLM-5 release exceeded expectations and the company's capacity expansion did not keep up, forcing a gradual rollout of GLM-5 access in the order of Max, Pro, and then Lite users.

Currently, access is fully open for Max users. While Pro users have access, they may encounter rate limiting during peak hours due to high cluster load. Lite users will be granted access in a phased rollout during off-peak hours after the holiday period.

For affected Lite and Pro users, Zhipu supports self-service refund applications.

Previously, on February 12th, Zhipu released its new flagship model, GLM-5, which gained significant popularity overseas. In terms of Coding and Agent capabilities, GLM-5 achieved open-source SOTA performance, with user experience in real programming scenarios approaching that of Claude Opus 4.5, excelling in complex system engineering and long-horizon Agent tasks.

As noted above, with supply falling short of demand after the GLM-5 release, Zhipu raised prices for its GLM Coding Plan subscriptions by 30% in China and by more than 100% overseas, making it the first AI-native company in China to raise prices for its commercial large model services.

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation to acquire or dispose of any financial products, and any associated discussions, comments, or posts by the author or other users should not be considered as such either. It is provided for general information purposes only and does not take into account your own investment objectives, financial situation, or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information; investors should do their own research and may seek professional advice before investing.
