Zhipu Releases AutoGLM 2.0: Can "Cloud Phones" Enable Fully Automated Agent Operations?

Competition among China's AI agents is heating up.

On August 20, Zhipu released AutoGLM 2.0, an AI agent product powered by its latest open-source models, GLM-4.5 and GLM-4.5V, and opened it to general users. The team said AutoGLM is now available on iOS, Android, and the web, and that it plans to ship new features every one to two weeks.

AutoGLM 2.0 is positioned as an execution-oriented agent that can directly "click, drag, and type." Unlike traditional assistants that write answers in a chat box, it is meant to carry out real actions on devices, including cross-platform tasks that span phones, computers, and browsers.

What exactly can it do? On the lifestyle front, AutoGLM can now handle operations on common high-frequency apps and websites, including food delivery, transportation, e-commerce ordering, and content platform searching and publishing. Its public demonstrations include automated operations on platforms like WeChat, TikTok, Xiaohongshu, Meituan, JD.com, and Pinduoduo.

In office scenarios, it covers the workflow from research and analysis through to the final output, automating steps such as opening web pages, searching, and summarizing in the browser.

A significant technical update in AutoGLM 2.0 is the combination of local and cloud-based operation. Earlier mobile agents often took over the user's screen, consumed local computing power, and were easily interrupted. AutoGLM 2.0 introduces a "cloud phone/cloud desktop" execution mode that runs apps in the cloud to complete operations such as ordering food or booking transportation, so the local device's screen stays free.

In one demonstration, given the instruction "Find nearby bubble tea shops on Meituan, order 20 cups and use coupons," AutoGLM 2.0 completes the task on a cloud phone without occupying the local device's foreground. It recognizes and skips advertisements, selects the category, taps the quantity button repeatedly until it reaches 20 cups, and applies an "8 yuan red envelope" (or whichever coupon gives the largest discount). The user is asked to confirm only at the final payment stage.
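The flow in this demonstration, automated steps followed by a single human checkpoint at payment, can be illustrated with a minimal Python sketch. It is not Zhipu's implementation: the `Step` class, the `run_task` function, and the step descriptions are hypothetical names chosen for illustration, and the steps are simulated rather than executed on a real app.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One simulated action in a task (hypothetical structure)."""
    description: str
    needs_confirmation: bool = False


def run_task(steps: List[Step], confirm: Callable[[str], bool]) -> List[str]:
    """Run steps in order, pausing for user approval on flagged steps.

    Returns a log of what happened. Nothing is executed on a real device;
    this only models the control flow described in the article.
    """
    log: List[str] = []
    for step in steps:
        if step.needs_confirmation and not confirm(step.description):
            # The user declined, so stop before the sensitive action.
            log.append(f"stopped before: {step.description}")
            break
        log.append(f"done: {step.description}")
    return log


# Steps paraphrased from the bubble tea demonstration (assumed ordering).
bubble_tea_order = [
    Step("skip advertisement"),
    Step("select bubble tea category"),
    Step("set quantity to 20"),
    Step("apply largest available coupon"),
    Step("pay", needs_confirmation=True),
]

if __name__ == "__main__":
    # Simulate a user who approves the payment.
    for line in run_task(bubble_tea_order, confirm=lambda _: True):
        print(line)
```

Passing a `confirm` callback that returns `False` ends the run before the payment step, which mirrors the idea that only the final, irreversible action needs the user's sign-off.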

The key point for users is that agent tasks run in parallel in the background without interrupting what they are doing on the device. Tasks can span platforms and apps, run for long periods, and be resumed or interacted with at any time.

In terms of product direction, AutoGLM 2.0 targets the same category as other "device-operating" general-purpose AI agents such as Google's Project Mariner and OpenAI's ChatGPT Agent, all of which emphasize cross-application, multi-step execution that users can delegate. The difference is that AutoGLM made real mobile app operations its primary use case from an early stage, with browser and PC coordination alongside.

In the current testing phase, however, actual operation still has problems: tasks can terminate when the agent jumps out of an app, and repeatedly running tasks on certain apps occasionally forces an account logout. Coverage and stability are uneven, with the level of adaptation and the success rate varying across websites and apps. Information sources may be biased, and complex, long-chain processes occasionally miss steps. The team still needs time to troubleshoot and improve the user experience.

In addition, differences in permissions between platforms, along with the account security and privacy authorization questions raised when an agent acts on a user's behalf, will remain long-term concerns for all agent products, including AutoGLM.

Beyond phones and PCs, Zhipu appears ready to integrate with a broader smart hardware ecosystem. The team has already packaged AutoGLM's execution capabilities into APIs, so developers can bring them to various hardware devices by connecting to the interface.

Cost is another closely watched dimension for general-purpose AI agents. Liu Xiao, GLM's technical lead at Zhipu, said that completing a task on AutoGLM currently costs about $0.2 (approximately 1.4 yuan), including model and virtual machine costs. As a comparison point he cited Google search, which he described as a truly general-purpose and highly valuable product, and which he estimated costs about $0.02 per query. Liu acknowledged he is not fully certain of these estimates, but believes the two figures are within roughly one order of magnitude of each other.

"Agent task execution costs have now been reduced to within one order of magnitude of ordinary universal search. And this will definitely be further compressed to within one order of magnitude, or even smaller, as scale and commercialization progress," he said.
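The "one order of magnitude" comparison can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above ($0.2 per agent task, 1.4 yuan, and the $0.02 per-query search estimate). The function name `cost_gap` and the variable names are illustrative, and the exchange rate shown is simply what the quoted dollar and yuan amounts imply, not an official rate.

```python
import math


def cost_gap(agent_cost: float, search_cost: float) -> tuple[float, float]:
    """Return (ratio, orders_of_magnitude) between two per-unit costs."""
    ratio = agent_cost / search_cost
    return ratio, math.log10(ratio)


# Figures as quoted in the article (estimates, per Liu Xiao).
AGENT_COST_USD = 0.2     # per AutoGLM task
AGENT_COST_CNY = 1.4     # the same cost expressed in yuan
SEARCH_COST_USD = 0.02   # estimated cost per search query

ratio, orders = cost_gap(AGENT_COST_USD, SEARCH_COST_USD)
implied_cny_per_usd = AGENT_COST_CNY / AGENT_COST_USD

if __name__ == "__main__":
    print(f"agent task costs about {ratio:.0f}x a search query")
    print(f"gap: about {orders:.1f} order(s) of magnitude")
    print(f"implied exchange rate: about {implied_cny_per_usd:.1f} yuan per dollar")
```

On these estimates an agent task costs roughly ten times a search query, which is a gap of about one order of magnitude.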

Zhipu CEO Zhang Peng said AutoGLM's goal is to give ordinary people, through the product, capabilities far beyond the average human level. He believes this is challenging because some tasks may already lie beyond what users themselves understand, and that learning to use AI capabilities effectively is the key to breaking through such barriers.

Zhang Peng is clearly in no hurry to assign AutoGLM a loftier mission at this stage. Zhipu's plan is to take this step first and use the product to push past cognitive barriers, so that ordinary people gradually realize they may be able to do something well even if they don't understand it, because they have an AI tool that understands it better than they do.

"I personally believe this is definitely a revolutionary, epoch-making thing," Zhang Peng said. "No one can say clearly how this will be achieved in the future or what product format it will take, but we hope that taking this step today has historical significance."

Tiger Brokers