7 Billion Yuan in the Computing Gap: Who's Paying for Infinigence AI's Middleware Business?

Deep News
05/07

Infinigence AI, a domestic AI infrastructure service provider, announced on May 7th the completion of a new funding round exceeding 7 billion yuan. This round was jointly led by Hangzhou High-tech Golden Investment Group and Huiyuan Capital. The co-investor consortium included Guoxing Capital, Chindata Group, GF Qianhe, Leaguer Qingyan, China Insurance Investment, AEF NextGen, Tenrui Capital, Colorlight, China Securities (CSC) Capital, and Quande Intelligent Learning Laboratory. Existing shareholders Legend Capital, Shanghai Guotou Future, and Yuanzhi Future also participated with additional investments.

Against an industry backdrop characterized by a mismatch between computing supply and demand and a fragmented landscape of underlying chip architectures, this funding round highlights the substantial, yet often underestimated, commercial value of the "AI infrastructure services" sector. The company defines its business model as that of a "computing power operator": it manages computing resources, with its self-developed software platform serving as the tool for allocating them efficiently.

The core challenge currently facing the large model industry is the "M×N" compatibility problem: upstream, there are N different types of AI chips with varying architectures and ecosystems; downstream, there are M different large model structures. Because operator libraries and compilation environments differ across hardware platforms, model developers incur significant time and R&D costs whenever they migrate workloads from one chip to another. Infinigence AI is targeting precisely this gap with "hardware-software decoupling" middleware. Its core products are the Agentic MaaS large model service platform and a suite of co-optimized software and hardware tools.
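The idea behind this kind of middleware can be sketched in a few lines. The code below is purely illustrative and does not reflect Infinigence AI's actual API; the backend names and the `run_kernel` interface are hypothetical. The point is structural: once every chip vendor implements one adapter against a common interface, M models and N chips need only M + N integrations instead of M × N.

```python
# Illustrative sketch of a hardware-abstraction layer (hypothetical API,
# not Infinigence AI's actual product). Each vendor supplies one backend
# adapter; each model is written once against the common interface.

from abc import ABC, abstractmethod


class ComputeBackend(ABC):
    """Common interface that hides vendor-specific operator libraries."""

    @abstractmethod
    def run_kernel(self, op: str, payload: str) -> str: ...


class CudaBackend(ComputeBackend):
    # Hypothetical adapter for an NVIDIA/CUDA stack.
    def run_kernel(self, op: str, payload: str) -> str:
        return f"cuda:{op}({payload})"


class AscendBackend(ComputeBackend):
    # Hypothetical adapter for a domestic chip stack.
    def run_kernel(self, op: str, payload: str) -> str:
        return f"ascend:{op}({payload})"


def serve_model(model_name: str, backend: ComputeBackend) -> str:
    """A model targets the common interface once and runs on any backend."""
    return backend.run_kernel("matmul", model_name)


# The same model runs unmodified on either backend.
print(serve_model("llm-7b", CudaBackend()))    # cuda:matmul(llm-7b)
print(serve_model("llm-7b", AscendBackend()))  # ascend:matmul(llm-7b)
```

In practice the hard part is not the interface but the adapters themselves, which is why the article stresses operator libraries and compilation environments as the real migration cost.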

If large models are likened to "electricity" in modern industry, and chips to "generators," then Infinigence AI aims to play the role of the "substation" and "smart grid." By pooling and virtualizing heterogeneous computing resources, it abstracts away the differences in underlying hardware, providing standardized computing interfaces for upper-layer applications. Its revenue streams are based on Token throughput, computing power leasing, and fees for private deployment services.

Founded in May 2023, coinciding with the broader AI boom, Infinigence AI has attracted capital at a remarkable pace due to its clear business strategy. Including this latest round, the company's total funding has reached nearly 22 billion yuan. The list of investors in this over 7 billion yuan round reveals a strong logic of industrial synergy. On one hand, the lead investment by Hangzhou High-tech Golden Investment Group signals a clear intent to build out industrial infrastructure. As regions across China rapidly construct intelligent computing centers, local governments require not just physical data centers but also technical operators capable of maximizing the utilization of these heterogeneous computing assets.

Simultaneously, strategic alignment with upstream and downstream partners is evident, with co-investors including Chindata Group (an IDC data center service provider) and Colorlight (a video and image display control hardware provider). This indicates that Infinigence AI's business scope is extending towards physical infrastructure and end-point hardware, attempting to forge more robust partnerships within the computing supply chain.

The concentrated capital interest is also attributable to the company's solid technical foundation. Its core team originates from the NICS-EFC laboratory of Tsinghua University's Department of Electronic Engineering. The founders, Wang Yu and Xia Lixue, a professor-student pairing, possess extensive experience and data accumulation in deep learning software-hardware co-optimization, EDA (Electronic Design Automation), and AI chip architecture.

The significant capital investment in Infinigence AI is predicated on the consistent delivery of its core operational metrics. In terms of execution accuracy, its Agentic MaaS platform achieves an accuracy alignment rate exceeding 99.9% with original manufacturer models when supporting complex toolchains. Regarding computational efficiency, the platform has boosted overall system throughput by 2 to 3 times, reduced system latency by 50%, and keeps first-token latency strictly under 500 milliseconds.

Furthermore, the rapid scaling of its business validates the strong market demand for computing power scheduling. At a macro level, by March 2026, China's daily Token call volume had surpassed 140 trillion, representing a growth of over 40% compared to the end of the previous year. Benefiting from this industry tailwind, the daily Token call volume on Infinigence AI's large model service platform had surged more than 20-fold by the end of April this year compared to the end of last year.

This data explosion signifies a fundamental shift in the industry's billing logic. As API price wars push large model prices toward marginal cost, the market's basis for procuring computing power is shifting from billing on "GPU rental time" to a "Token economy" centered on Token throughput and generation efficiency. Under this new paradigm, Infinigence AI's business model becomes viable. It does not profit from hardware arbitrage; instead, through operator reconstruction and model compression, it extracts a significantly higher-than-average volume of effective Tokens for the same hardware depreciation. This technology optimization premium, derived from cost reduction and efficiency gains, forms the underlying financial logic supporting its over 7 billion yuan funding round.
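The shift from "GPU rental time" to "Token throughput" billing can be made concrete with a back-of-the-envelope calculation. All figures below are hypothetical placeholders, not Infinigence AI's actual pricing or performance data; the only claim taken from the article is that a 2-3x throughput gain on the same hardware proportionally cuts the cost of each effective token.

```python
# Illustrative arithmetic only: all numbers are hypothetical, chosen to
# show how a throughput gain on fixed hardware cost lowers per-token cost.

def cost_per_million_tokens(gpu_hour_cost: float, tokens_per_hour: float) -> float:
    """Cost (in the same currency as gpu_hour_cost) to generate 1M tokens."""
    return gpu_hour_cost / tokens_per_hour * 1_000_000

# Same GPU-hour cost, but the optimized stack emits 3x the tokens per hour.
baseline  = cost_per_million_tokens(gpu_hour_cost=20.0, tokens_per_hour=5_000_000)
optimized = cost_per_million_tokens(gpu_hour_cost=20.0, tokens_per_hour=15_000_000)

print(f"baseline:  {baseline:.2f} per 1M tokens")   # 4.00
print(f"optimized: {optimized:.2f} per 1M tokens")  # 1.33
```

Under token-based billing, that gap between baseline and optimized cost is exactly the "technology optimization premium" the middleware provider can capture.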

Following this funding, industry focus will shift to how Infinigence AI manages its valuation and further expands its business boundaries. The company has proposed an "AI Productivity Formula," defining it as the product of "Intelligent Scale × Token Production Efficiency × Token Value Conversion." This indicates an ambition to elevate the industry narrative from mere "computing power scheduling" to establishing a "Token Economy" with Chinese characteristics.

In practical commercial deployment, AI infrastructure service providers still face several significant hurdles. The primary challenge comes from the consolidation of ecosystems by hardware manufacturers. As leading companies like NVIDIA continue to strengthen their CUDA ecosystem, and domestic chip giants gradually introduce native integrated hardware-software solutions, third-party middleware providers must demonstrate their indispensability through deeper-level underlying compilation and operator reconstruction to avoid being marginalized or reduced to simple conduits by upstream and downstream players.

A more long-term variable is the decentralization of computing architecture. As computing demands extend from purely cloud-based to edge devices and even physical AI applications, the complexity of computing networks will increase exponentially. Infinigence AI has already begun positioning itself in this area, with its "Boundless" terminal intelligence system aiming to provide an integrated solution of "edge model + edge software + edge IP." Under the stricter power consumption, thermal, and latency constraints of future automotive or robotic applications, the ability to achieve efficient computing resource allocation across cloud, edge, and end-point devices will be the critical test determining whether this high-profile company can truly become the "Android system" of the AI era.
