In the Agent era, large language models are shifting from mere "chat tools" to "autonomous employees," according to a research report. Model developers that control core algorithms and industry interfaces stand to benefit significantly from the widespread adoption of intelligent systems. The report highlights the "twin stars of large models," MINIMAX-WP (00100) and KNOWLEDGE ATLAS (02513), both of which listed successfully earlier this year. As the "brain" of native Agent ecosystems, these companies carry high scarcity value. The report's key points are as follows.
As of February 2, 2026, Clawdbot had garnered over 130,000 stars on the code hosting platform GitHub, and its official website had accumulated more than 2 million visits, making it one of the fastest-growing open-source technology projects of late. At the same time, emerging "AI-only communities" such as Moltbook have rapidly amassed millions of agent accounts. Agent interactions at this scale naturally translate into higher request density and more frequent API triggers; the most directly observable effect is a step-like increase in API call frequency and token throughput.
Following a strong recommendation from Clawdbot founder Peter Steinberger, the M2.1 model from domestic AI unicorn MINIMAX, which excels in long-text processing and logical reasoning, has gained significant popularity. The importance of model unit cost is rising. Under the traditional dialogue paradigm, a single interaction requires only a few model calls. In the workflow paradigm, by contrast, a single task often spans multiple stages, including planning, retrieval, tool invocation, validation, error correction, and writing to external systems. This multiplies the number of model calls, the context length, and the complexity of intermediate information.
Multi-step reasoning and multi-round tool calls inherently generate "multi-turn contexts," while retries and self-correction produce additional wasted tokens. Compared to basic chat, agent services for complex tasks may consume dozens of times more tokens. Consequently, "model unit cost multiplied by unit output" becomes the critical factor determining whether Agent-based products can achieve scale, as multi-round reasoning and tool collaboration linearly amplify costs during task execution. This is precisely why Clawdbot's founder explicitly recommended MINIMAX: the M2.1 model's combination of efficiency, cost advantage, strong long-text capability, and reasoning and programming prowess matches what users currently need.
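The token arithmetic behind this argument can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the report: the step counts, the token sizes, the blended price, and the simplification that each agent step re-sends the base context plus all earlier step outputs.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One model call: tokens sent in and tokens generated."""
    input_tokens: int
    output_tokens: int


def total_tokens(steps):
    """Sum input and output tokens over every model call in a task."""
    return sum(s.input_tokens + s.output_tokens for s in steps)


def task_cost(steps, price_per_million_tokens):
    """Task cost = unit price x tokens consumed (single blended price, hypothetical)."""
    return total_tokens(steps) * price_per_million_tokens / 1_000_000


def agent_steps(n_steps, base_context, output_per_step):
    """Toy agent run: step i re-sends the base context plus all i earlier outputs."""
    return [
        Step(base_context + i * output_per_step, output_per_step)
        for i in range(n_steps)
    ]


# Hypothetical workloads: one chat turn versus a ten-step agent task.
chat = [Step(input_tokens=1_000, output_tokens=500)]
agent = agent_steps(n_steps=10, base_context=1_000, output_per_step=500)

chat_tokens = total_tokens(chat)
agent_tokens = total_tokens(agent)
ratio = agent_tokens / chat_tokens

if __name__ == "__main__":
    print(chat_tokens, agent_tokens, ratio)
```

Under these assumed numbers the chat turn uses 1,500 tokens and the agent task uses 37,500, a ratio of 25, which is consistent in order of magnitude with the "dozens of times" figure cited above. Because `task_cost` multiplies token volume by unit price, any reduction in unit price carries through proportionally to the cost of each task.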
The M2.1 model targets a key pain point for developers doing automated programming, namely high token costs, with aggressive pricing of roughly 8% of Claude Sonnet's. In addition, its Coding Plan introduces a high-frequency refresh mechanism that resets quotas every 5 hours, departing from the industry-standard daily or monthly limits and unlocking productivity in high-frequency, intensive development scenarios. On billing, rather than following the common pay-as-you-go logic for underlying model tokens, the company uses a tiered monthly subscription system.
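A quota that resets every 5 hours could be modeled roughly as follows. This is only a sketch: the class name `WindowedQuota`, the request limit, and the choice to start each window at the first request after the previous window expires are all hypothetical, since the report does not describe how the Coding Plan actually counts usage.

```python
from datetime import datetime, timedelta


class WindowedQuota:
    """Hypothetical quota that refills once the current window has elapsed."""

    def __init__(self, limit, window=timedelta(hours=5)):
        self.limit = limit          # requests allowed per window (assumed unit)
        self.window = window        # 5 hours, per the report
        self.window_start = None    # set on the first request of a window
        self.used = 0

    def try_consume(self, now, amount=1):
        """Return True and record usage if the request fits in the current window."""
        # Open a fresh window on first use or once the old window has expired.
        if self.window_start is None or now - self.window_start >= self.window:
            self.window_start = now
            self.used = 0
        if self.used + amount > self.limit:
            return False
        self.used += amount
        return True


# Usage with an assumed limit of 2 requests per window.
quota = WindowedQuota(limit=2)
start = datetime(2026, 1, 1, 9, 0)
first = quota.try_consume(start)
second = quota.try_consume(start + timedelta(hours=1))
third = quota.try_consume(start + timedelta(hours=2))       # over the limit
after_reset = quota.try_consume(start + timedelta(hours=5))  # new window
```

The point of the sketch is the contrast with daily or monthly caps: a developer who exhausts the allowance waits at most one 5-hour window for it to refill, rather than until the next day or billing cycle.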
In real workflows, evolving contexts typically contain tool calls, historical information, retrieved snippets, constraints, and more. The M2.1 model's long-text capability makes it more suitable for maintaining "continuous memory," enabling it to read longer documents, accommodate more intermediate results, and reduce logic breaks caused by truncation.
In products like Clawdbot, which emphasize automated execution and closed-loop error correction, models are used for writing code, modifying code, making judgments, and performing validations. The M2.1 model's "sufficient and highly cost-effective" reasoning and programming capabilities make it the ideal choice for integration into production systems and high-frequency invocation. In the Agent era, "which model is smarter" is important, but more crucial is "which can transform strong capabilities into frequently usable productivity at a lower cost." This is considered MINIMAX's key advantage.
As Agents enter office and production environments, inputs are no longer primarily pure text; they increasingly arrive as visual information. In executable workflows like Clawdbot, user inputs include not only structured text but also screenshots, web interfaces, UI elements, error pop-ups, tables, charts, and PDF pages. MINIMAX's multimodal capabilities help Agents understand interfaces, extract key information, output executable steps or code, and re-read screenshots to validate and correct their work.
This enables Clawdbot to perform "visually-driven automation," such as automatically filling forms after identifying table fields, locating causes and modifying scripts after reading error screenshots, extracting data from charts and writing it into reports, and comparing before-and-after screenshots to confirm task completion. Leveraging its multimodal abilities, MINIMAX can better complete service closed loops, reduce manual interpretation, enable quick error correction, and achieve stronger deliverability.
Risk warning: Technology roadmaps are subject to uncertainty; industry competition is intensifying.