Xiaomi Claims "Mystery Model" That Sparked Global Developer Buzz, AI Roadmap Gains Clarity

Deep News
03/19

A "mystery model" once widely speculated to be DeepSeek V4 has been officially claimed by Xiaomi. This move not only resolved the identity puzzle within the developer community but also provided capital markets with a clearer reference point for understanding how Xiaomi's AI investments are materializing.

Recently, an AI model named Hunter Alpha appeared anonymously on the developer platform OpenRouter, capturing the attention of the global developer community. Listed on March 11 as a "hidden model," Hunter Alpha boasted a scale of over 1 trillion parameters and a 100,000-token context window. Its specifications closely matched the rumored details of DeepSeek V4, leading to widespread speculation.

Xiaomi put an end to days of conjecture with an official announcement. In the early hours of March 19, Xiaomi unveiled a trio of updates to its MiMo large model series: the flagship foundation model MiMo-V2-Pro, the full-modal Agent model MiMo-V2-Omni, and the speech synthesis model MiMo-V2-TTS. Subsequently, Xiaomi founder Lei Jun stated on Weibo that the company had just released the trillion-parameter model MiMo-V2-Pro. He added that Xiaomi's "actual progress in the AI field may be much faster than what everyone has seen," and revealed that AI R&D and capital expenditure for the year will exceed 16 billion yuan.

In a research report dated March 19, Goldman Sachs noted that this concentrated release of three flagship models signifies Xiaomi's transition from the AI R&D investment phase to the results realization phase. The firm stated that Xiaomi's market positioning as a "Physical AI leader" is gradually gaining substantial support. Goldman Sachs maintained its "Buy" rating on Xiaomi with a 12-month target price of HK$41, implying approximately 14% upside from the current share price.

The three Xiaomi models made a significant debut. MiMo-V2-Pro is designed for high-intensity Agent work scenarios in the real world. It features over 1T total parameters (42B activated) and supports a 1M-token ultra-long context. Its underlying architecture retains a hybrid attention mechanism while raising the hybrid ratio from 5:1 to 7:1, balancing massive scale with high inference efficiency.
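The exact layer schedule behind the "hybrid ratio" is not disclosed. Assuming, as in other hybrid-attention designs, that the ratio counts efficient (e.g., linear or sliding-window) attention layers per full-attention layer, the shift from 5:1 to 7:1 can be sketched as follows; the function name and layout are illustrative assumptions, not Xiaomi's published architecture:

```python
# Hypothetical sketch of a "hybrid ratio" layer schedule: `ratio` efficient-
# attention layers are interleaved per full-attention layer. The actual
# MiMo-V2-Pro schedule is not public; this layout is an assumption.

def layer_schedule(num_layers: int, ratio: int) -> list[str]:
    """Return a per-layer attention type for the given hybrid ratio."""
    schedule = []
    for i in range(num_layers):
        # every (ratio + 1)-th layer uses full (global) attention
        if (i + 1) % (ratio + 1) == 0:
            schedule.append("full")
        else:
            schedule.append("efficient")
    return schedule

old = layer_schedule(48, 5)  # 5:1 hybrid ratio -> 8 full-attention layers
new = layer_schedule(48, 7)  # 7:1 hybrid ratio -> 6 full-attention layers
print(old.count("full"), new.count("full"))
```

Under this reading, raising the ratio reduces the share of costly full-attention layers, which is consistent with the article's claim that the change trades scale against inference efficiency.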

MiMo-V2-Omni is positioned as Xiaomi's full-modal foundation model, integrating multimodal understanding capabilities for images, video, and audio with powerful agent capabilities. According to the Goldman Sachs report, this model meets or exceeds the levels of Gemini 3 Pro, Claude Opus 4.6, Gemini 3, and GPT-5.2 on several core metrics, including audio understanding, image understanding, video understanding, and agent capability.

MiMo-V2-TTS, targeting the era of voice agents, offers highly controllable multi-granularity style control, natural rhythm reproduction, and singing capabilities. Goldman Sachs pointed out that its next goals include expanding language coverage beyond Chinese and English and deeply integrating with the multimodal understanding capabilities of MiMo-V2-Omni, enabling agents to describe the real world with a voice approaching human expressiveness.

All three models have been integrated into WPS Office, the miclaw agent system on Xiaomi phones and computers, and the Xiaomi browser.

Regarding flagship model performance, MiMo-V2-Pro is the core of this release. The model has over 1 trillion total parameters, 42 billion active parameters, and a 1-million-token context window. On the Artificial Analysis Intelligence Index, a comprehensive global large-model leaderboard, it ranks eighth worldwide and second among Chinese models, surpassing xAI's Grok and trailing only Gemini 3.1 Pro Preview, GPT-5.4, GPT-5.3 Codex, Claude Opus 4.6, Claude Sonnet 4.6, and GLM-5.

Goldman Sachs highlighted cost efficiency as another core competitive advantage of MiMo-V2-Pro. According to Artificial Analysis data, running the model through the Intelligence Index test cost $348, 36% lower than GLM-5 (which ranks higher on the same list) and 90% lower than Claude Sonnet 4.6. Compared to Claude Opus 4.6 and Claude Sonnet 4.6, MiMo-V2-Pro's token usage cost is up to 80% lower.
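The percentage comparisons can be checked with back-of-the-envelope arithmetic. "36% lower" means MiMo's cost equals the peer's cost times (1 − 0.36), so the implied peer costs follow by division; these derived figures are inferences from the cited numbers, not values stated by the report:

```python
# Back-of-the-envelope check of the cost comparisons cited above.
# "X% lower" means mimo_cost = peer_cost * (1 - X), so peer = mimo / (1 - X).
mimo_cost = 348.0  # USD, Artificial Analysis Intelligence Index run

implied_glm5 = mimo_cost / (1 - 0.36)    # implied GLM-5 run cost
implied_sonnet = mimo_cost / (1 - 0.90)  # implied Claude Sonnet 4.6 run cost

print(f"Implied GLM-5 cost:  ${implied_glm5:,.0f}")   # ~$544
print(f"Implied Sonnet cost: ${implied_sonnet:,.0f}")  # $3,480
```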

In terms of training efficiency, the ARL-Tangram system recently launched by Xiaomi has been deployed to support the training of the MiMo series models. This system has achieved an average action completion time improvement of 4.3 times, a reinforcement learning training speed increase of up to 1.5 times, and external resource savings of up to 71%.

It is worth noting that MiMo-V2-Pro is deeply optimized for Agent scenarios. It has undergone supervised fine-tuning (SFT) and reinforcement learning (RL) across complex and diverse Agent scaffolds, equipping it with stronger tool-invocation and multi-step reasoning capabilities. On PinchBench and ClawEval, the standard OpenClaw evaluation benchmarks, MiMo-V2-Pro's performance ranks among the best globally. Furthermore, with its 1M ultra-long context window, MiMo-V2-Pro can comfortably support high-intensity, complex real-world application flows.

The AI roadmap is becoming clearer, showcasing a systematic layout from models to ecosystem. The Goldman Sachs report emphasized that this release is not an isolated event but part of Xiaomi's systematic push to accelerate the conversion of AI R&D investment into tangible results.

Prior to this, Xiaomi released the vision-language-action model Xiaomi-Robotics-0 for robot reasoning and real-time execution in February of this year, launched the AI agent system miclaw in March, and simultaneously introduced an upgraded advanced driver-assistance system (HAD) equipped with the XLA cognitive model on March 19. This model integrates Xiaomi's self-developed cross-embodied foundation model, MiMo-Embodied.

Goldman Sachs outlined clear iterative directions for the three models: the next goal for MiMo-V2-Pro is to tackle high-complexity reasoning and long-cycle task planning; MiMo-V2-Omni aims to achieve sustained intention planning across hours or even days, real-time stream perception, and execution of actions through robotic bodies and hands; MiMo-V2-TTS will expand into multiple languages and deepen its integration with Omni.

Goldman Sachs believes that with leading multimodal AI capabilities and a wealth of agent application scenarios within its "Human x Car x Home" ecosystem, Xiaomi has the potential to capture a significant share of the global AI model market. Simultaneously, it is poised to create high-premium, differentiated consumer AI terminals.

Regarding increased investment and valuation logic: short-term profits face pressure, but long-term value is being reassessed. Goldman Sachs expects Xiaomi's R&D expenditure to reach 40 billion yuan in 2026, up from an estimated 32.2 billion yuan in 2025. This sustained increase will weigh on near-term profits: Goldman Sachs forecasts 2026 net profit (before special items) of approximately 27.9 billion yuan, down from the 2025 estimate of 39.5 billion yuan, corresponding to a 2026 price-to-earnings ratio of about 27.4 times.
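The P/E figure can be cross-checked: a 27.4x multiple on a 27.9 billion yuan profit forecast implies the equity value Goldman's ratio is working from. This back-calculated figure is an inference from the two cited numbers, not a value stated in the report:

```python
# Cross-check of the valuation arithmetic: P/E = market cap / net profit,
# so the implied market cap is the profit forecast times the multiple.
net_profit_2026 = 27.9  # bn yuan, pre special items (GS forecast)
pe_2026 = 27.4          # forecast 2026 price-to-earnings ratio

implied_market_cap = net_profit_2026 * pe_2026
print(f"Implied market cap: ~{implied_market_cap:.0f}bn yuan")  # ~764bn yuan
```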

Nevertheless, Goldman Sachs argues that the continuous delivery of results should prompt the market to reprice Xiaomi as a Physical AI leader with self-developed AI, operating systems, and chip capabilities, rather than valuing it solely on near-term P/E multiples. The firm maintains its Buy rating and HK$41 target price, based on a sum-of-the-parts valuation methodology. This includes applying a 16x target 12-month forward EV/NOPAT multiple to Xiaomi's core business, using a DCF valuation ($45 billion) for the electric vehicle business, and applying a 10% conglomerate discount.
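The sum-of-the-parts structure described above can be sketched numerically. Only the 16x EV/NOPAT multiple, the $45 billion EV DCF value, and the 10% conglomerate discount come from the report; the core-business NOPAT input below is a hypothetical placeholder, since that figure is not given:

```python
# Sketch of the sum-of-the-parts (SOTP) structure from the report.
# The multiple, EV DCF value, and discount are as reported; the NOPAT
# input is a HYPOTHETICAL placeholder, not Goldman's actual figure.
core_nopat = 5.0            # bn USD-equivalent -- placeholder assumption
ev_nopat_multiple = 16      # target 12-month forward EV/NOPAT (per report)
ev_business_dcf = 45.0      # bn USD, EV segment DCF value (per report)
conglomerate_discount = 0.10  # per report

core_value = core_nopat * ev_nopat_multiple
total_value = (core_value + ev_business_dcf) * (1 - conglomerate_discount)
print(f"Illustrative SOTP value: ${total_value:.1f}bn")
```

The point of the sketch is the structure, not the output: each segment is valued on its own methodology, summed, and then haircut by the conglomerate discount before being divided by the share count to reach a target price.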

Disclaimer: Investing involves risk, and this article is not investment advice. Nothing above should be treated as an offer, recommendation, or invitation to buy or sell any financial product, nor should any related discussions, comments, or posts by the author or other users. This article is for general reference only and does not take into account your personal investment objectives, financial situation, or needs. TTM makes no representation or warranty as to the accuracy or completeness of the information; investors should conduct their own research and seek professional advice before investing.
