AI Enters "2.0" Era: OpenAI's $300 Billion Inference Power Play

Stock News
04/17

According to sources familiar with the matter, OpenAI has agreed to pay chip startup Cerebras more than $200 billion over the next three years for access to servers powered by the latter's chips. Under the agreement, the maker of ChatGPT could also acquire an equity stake in the company. The move comes as OpenAI races to maintain its lead in artificial intelligence and to meet surging demand.

In January, OpenAI agreed to purchase up to 750 megawatts of computing capacity from Cerebras over three years, a deal valued at more than $100 billion. The commitment now described by sources exceeds that previously reported agreement.

The transaction underscores the industry's growing need for the computational power required for "inference," the process by which AI models generate responses. Companies are currently racing to develop inference models and applications aimed at accelerating broader AI adoption.

Reports indicate that Cerebras, headquartered in Sunnyvale, California, may disclose details of the previously undisclosed agreement with OpenAI as early as Friday. The deal grants OpenAI warrants for a minority stake in Cerebras, with the shareholding potentially increasing as spending rises. OpenAI has also agreed to provide Cerebras with roughly $10 billion to help fund construction of the data centers needed to run its AI products. OpenAI's total outlay over the next three years could reach $300 billion, which might translate into warrants for up to a 10% equity stake in Cerebras.

The explosion in inference demand marks a shift in the AI industry's focus. In the early stages of AI development, attention was concentrated largely on "training."
With OpenAI's massive $300 billion compute agreement with Cerebras, however, a clear signal has been sent: the competition is shifting from "how to make models smarter" to "how to make intelligence affordable." The main battlefield for computing power is now migrating en masse toward inference. Industry projections suggest that by 2026 inference will account for two-thirds of incremental computing demand, and that its share will eventually exceed 80%.

The growth rate is equally staggering. Calculations based on the latest OpenRouter data show that in a single week in early April, global AI large-model calls reached 27 trillion tokens, up 18.9% from the previous week. Of that total, Chinese AI large models handled 12.96 trillion tokens in weekly calls, surpassing the United States for the fifth consecutive week.

At the same time, the barrier to inference is falling rapidly. According to the Stanford 2025 AI Index Report, the cost of achieving inference performance equivalent to GPT-3.5 has dropped 280-fold within two years. Exploding demand and plummeting costs are together paving the way for the large-scale application of AI.

Training costs represent fixed, predictable capital expenditure. But once a user base surpasses hundreds of millions, every ChatGPT response and every generated video incurs real-time operating expense, and those costs rise linearly with user growth. Experts predict that more than 90% of the AI industry's future computing expenditure will occur at the inference stage. For companies like OpenAI, failing to minimize the cost per inference could leave the business model's moat extremely fragile.

Training demands strong general-purpose performance, whereas inference prioritizes energy efficiency and latency. That gap creates significant opportunities for startups such as Cerebras and Groq, as well as for major cloud providers developing their own chips, such as Google's TPU and AWS Inferentia.

Disclaimer: Investing carries risk, and this article is not investment advice. Nothing above should be construed as an offer, recommendation, or invitation to buy or sell any financial product, nor should any related discussions, comments, or posts by the author or other users. This article is for general reference only and does not take into account your personal investment objectives, financial situation, or needs. TTM assumes no responsibility for, and makes no guarantee of, the accuracy or completeness of the information; investors should conduct their own research and seek professional advice before investing.
