After the Rise of AI Agents, the Entire AI Value Chain Has Been Redefined! Morgan Stanley: GPUs Are No Longer Everything

The core narrative of AI investment is undergoing a profound and irreversible structural shift.

Over the past two years, the market has fixated on the logic that “whoever owns more GPUs holds the future.” However, as AI evolves from content generation to autonomous task execution, the industry’s core constraint is shifting from a shortage of computing power to gaps in system efficiency. Investment logic is expanding accordingly, from a race in single-chip computing to full-stack system engineering.

According to Trading Desk, Shawn Kim, an analyst at Morgan Stanley Research, stated explicitly in the report: “Agentic AI marks a structural shift from computation to orchestration.”

Within agentic AI workflows, CPU-led orchestration accounts for 50% to 90% of total latency. This transition is projected to create an incremental CPU market of USD 32.5 billion to USD 60 billion by 2030, lifting the total server CPU TAM to USD 82.5 billion to USD 110 billion. Agentic deployment will also drive an additional 15 to 45 EB of new DRAM demand by 2030, equivalent to 26% to 77% of the industry’s total annual DRAM supply in 2027.

Meanwhile, segments including DRAM, ABF substrates, wafer foundry, storage, connectors and passive components are evolving from secondary enablers into new bottlenecks and profit pools.

This shift carries clear market implications: beneficiaries of AI capital expenditure will expand from a handful of chip giants to the global industrial chain. The next wave of excess returns will likely concentrate on enabling segments that become structural bottlenecks in agent workflows and face rigid capacity expansion constraints. As bottlenecks migrate across hardware layers, the weight and profit distribution of the AI value chain will be reshaped.

What Exactly Is AI Optimizing?

To fully grasp this research, it is critical to address a fundamental question: what is the core optimization target of modern AI systems?

In the generative AI era, the goal was straightforward — model capability, or the ability to generate higher-quality content. Key metrics included model scale (parameter count), training efficiency (FLOPs), and inference performance (tokens per second).

As a result, GPUs became the absolute core, with NVIDIA emerging as the biggest beneficiary of this cycle.

The rise of Agentic AI has triggered a fundamental shift in objective functions. Systems are no longer limited to content generation but are required to complete closed-loop tasks. Evaluation metrics have therefore shifted from pure capability to operational efficiency: cost per task, system latency, and overall throughput.

This leads to a decisive industry conclusion: GPUs determine technical feasibility, while CPUs and system architecture determine long-term profitability.

Structural Inflection: From Generation to Execution, Bottlenecks Shift from Computing to Orchestration

Traditional generative AI workflows are highly streamlined. When a user request arrives, the CPU performs basic preprocessing, the GPU handles token generation, and the result is returned directly. In this linear chain, the GPU captures the majority of the value, with the CPU serving only as an auxiliary component.

Agentic AI operates on a fundamentally different framework. A complete task requires multi-stage iteration, including planning, information retrieval, tool calling, execution, feedback and re-decision-making. It also covers multi-agent collaboration, permission management, state persistence and continuous scheduling. Agentic AI does not increase the computational load of a single inference, but adds massive procedural steps, state management and cross-module coordination — workloads inherently better suited for CPU processing.

In short, the core pain point of AI is evolving from computing constraints to orchestration constraints.

Two tangible industry changes will follow:

First, the industry-wide CPU-to-GPU ratio in data center clusters will rise structurally.

Second, DRAM will be upgraded from a basic capacity option to a core performance component governing system throughput.

Data center bottlenecks will increasingly reside in memory bandwidth, data movement, interconnection latency and system-level coordination, rather than standalone GPU computing power.
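Why a large orchestration share caps the payoff from faster GPUs can be illustrated with Amdahl's law. The sketch below is not from the report. It takes the 50% to 90% orchestration share of latency cited earlier and assumes, for simplicity, that orchestration and GPU work run strictly one after the other. The function name and the 10x GPU speedup are hypothetical choices made for illustration.

```python
def end_to_end_speedup(orchestration_share: float, gpu_speedup: float) -> float:
    """Amdahl's law: overall speedup when only the GPU-bound share gets faster.

    Assumes orchestration (CPU-bound) and generation (GPU-bound) do not overlap.
    """
    gpu_share = 1.0 - orchestration_share
    return 1.0 / (orchestration_share + gpu_share / gpu_speedup)


# Orchestration shares of total latency quoted in the article: 50% and 90%.
for share in (0.5, 0.9):
    gain = end_to_end_speedup(share, gpu_speedup=10)  # hypothetical 10x faster GPU
    ceiling = 1.0 / share  # limit as the GPU becomes infinitely fast
    print(f"orchestration {share:.0%}: 10x GPU gives {gain:.2f}x, ceiling {ceiling:.2f}x")
```

Under these assumptions, a tenfold GPU improvement yields roughly 1.8x end to end at a 50% orchestration share and about 1.1x at 90%, which is the arithmetic behind the shift in attention toward CPUs, memory and interconnects.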

CPU Repricing: Ratio Shift from 1:12 to 1:2 and Beyond

The traditional AI server benchmark was defined as one CPU supporting twelve GPUs.

The report highlights that lengthened agent workflows, complex tool invocation and advanced context management are rapidly reversing this ratio.

Based on NVIDIA’s technology roadmap, the CPU-to-GPU ratio has approached 1:2 on the Rubin platform. For advanced architectures such as Rubin Ultra, a reversed configuration of two CPUs per single GPU may become mainstream. Even a moderate adjustment from 1:12 to 1:8 will trigger a substantial surge in overall CPU demand for large-scale AI deployment.

If this trend accelerates, CPU demand will no longer be tied solely to server shipment cycles, but driven by the rising complexity of agent logic, delivering sustained structural growth.
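To see what these ratio changes mean in unit terms, here is a minimal sketch that counts the CPUs required for a fixed GPU fleet at each ratio mentioned above. The fleet size of 96,000 GPUs is a hypothetical figure chosen only because it divides evenly; it does not come from the report.

```python
from fractions import Fraction

GPUS = 96_000  # hypothetical cluster size, for illustration only

# CPU-to-GPU ratios from the text, expressed as CPUs per GPU.
ratios = {
    "1:12 (traditional)": Fraction(1, 12),
    "1:8 (moderate shift)": Fraction(1, 8),
    "1:2 (Rubin)": Fraction(1, 2),
    "2:1 (Rubin Ultra, possible)": Fraction(2, 1),
}

cpu_counts = {name: GPUS * ratio for name, ratio in ratios.items()}
baseline = cpu_counts["1:12 (traditional)"]

for name, cpus in cpu_counts.items():
    multiple = float(cpus / baseline)
    print(f"{name:<28} {int(cpus):>8,} CPUs  ({multiple:.1f}x the 1:12 baseline)")
```

For the same number of GPUs, moving from 1:12 to 1:8 raises CPU demand by 50%, 1:2 means six times as many CPUs, and 2:1 means twenty-four times as many.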

CPU TAM Expansion: 2030 Total TAM of USD 82.5B–110B, Driven by Orchestration

Morgan Stanley adopts a layered system framework to separate agent-driven CPU incremental demand from traditional server replacement needs, dividing the market into three segmented categories:

Head Node CPU: Covers rack control layers closely integrated with GPU clusters. Based on an estimated 5 million global AI accelerators in 2030, with two high-end CPUs paired per accelerator and an average selling price of USD 5,000, this segment corresponds to a TAM of approximately USD 50 billion.

Orchestration CPU: Addresses incremental demand brought by agent scheduling, including task planning, toolchains, RAG pipelines, KV cache, vector database memory services, strategy deployment and system observability. An additional 10 million to 15 million orchestration CPUs with an ASP of USD 3,000 will generate a TAM of USD 30 billion–45 billion.

Other CPU: Includes storage nodes and partial network hardware, with a TAM ranging from USD 2.5 billion to USD 15 billion.

Combined, the total server CPU TAM will reach USD 82.5 billion to USD 110 billion by 2030, of which agentic AI will contribute USD 32.5 billion to USD 60 billion in incremental market value. The core assumption is global AI data center infrastructure spending of USD 1.2 trillion by 2030, compared with USD 242 billion in 2025.
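The build-up of the three segments can be reproduced with simple arithmetic. The sketch below uses only the unit counts, prices and ranges quoted above. One observation is inferred from the numbers rather than stated in the report: the incremental USD 32.5 billion to USD 60 billion matches the sum of the orchestration and other segments. Variable names are illustrative.

```python
BILLION = 10**9

# Head node: 5 million accelerators x 2 CPUs each x USD 5,000 ASP.
head_node = 5_000_000 * 2 * 5_000

# Orchestration: 10 to 15 million CPUs x USD 3,000 ASP (low, high).
orchestration = (10_000_000 * 3_000, 15_000_000 * 3_000)

# Other CPUs (storage nodes, some networking): quoted range in USD (low, high).
other = (2_500_000_000, 15_000_000_000)

# Inferred: incremental agentic demand = orchestration + other.
incremental = tuple(o + x for o, x in zip(orchestration, other))
total = tuple(head_node + inc for inc in incremental)

print(f"Head node:     USD {head_node / BILLION:.1f}bn")
print(f"Orchestration: USD {orchestration[0] / BILLION:.1f}bn to {orchestration[1] / BILLION:.1f}bn")
print(f"Incremental:   USD {incremental[0] / BILLION:.1f}bn to {incremental[1] / BILLION:.1f}bn")
print(f"Total TAM:     USD {total[0] / BILLION:.1f}bn to {total[1] / BILLION:.1f}bn")
```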

The report also outlines upside catalysts. Under NVIDIA’s bullish forecast of USD 3 trillion to 5 trillion in AI infrastructure spending by 2030, the total CPU TAM could expand sharply, to USD 206 billion–275 billion at the low end of that forecast and USD 344 billion–458 billion at the high end. While not the baseline forecast, this scenario reflects the massive multiplier effect of large-scale AI factory expansion on CPU consumption.
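The upside figures are consistent with scaling the baseline CPU TAM in proportion to infrastructure spending. The sketch below assumes that linear relationship, which is an inference from the numbers rather than a method the report is known to use. The function name is illustrative.

```python
BASE_SPEND_TN = 1.2            # baseline 2030 AI infrastructure spending, USD trillion
BASE_TAM_BN = (82.5, 110.0)    # baseline 2030 server CPU TAM, USD billion (low, high)


def scaled_tam(spend_tn: float) -> tuple[int, ...]:
    """Scale the baseline TAM range linearly with spending (assumed relationship)."""
    factor = spend_tn / BASE_SPEND_TN
    return tuple(round(tam * factor) for tam in BASE_TAM_BN)


for spend in (3.0, 5.0):
    low, high = scaled_tam(spend)
    print(f"USD {spend:.0f}tn spending -> CPU TAM of USD {low}bn to {high}bn")
```

At USD 3 trillion the result is USD 206 billion to 275 billion, and at USD 5 trillion it is USD 344 billion to 458 billion, matching the two ranges quoted above.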

Memory Upgrade: From Capacity Hardware to Performance Core

If the CPU acts as the system’s command hub, DRAM functions as AI’s core operating space.

Agentic architectures require real-time storage and rapid access to massive state data, including contextual information, KV cache, intermediate tool-calling data and multi-task concurrent datasets.

Accordingly, DRAM is no longer a marginal capacity accessory, but a core hardware component that directly determines system throughput and response speed.

Estimates show agentic AI will add 15 to 45 EB of new DRAM demand by 2030, equivalent to 26% to 77% of global annual DRAM supply in 2027. This structural surge will transform the memory sector from a highly cyclical industry into a structural growth sector. Manufacturers including SK hynix and Samsung Electronics will benefit from more stable and more visible earnings.
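As a rough cross-check, the two ends of the demand range can be divided by their stated shares to back out the 2027 supply figure they imply. This is a back-of-envelope calculation on the quoted numbers only; the implied supply is not a figure given in the report.

```python
# New agentic DRAM demand by 2030 (EB) and its share of 2027 supply, as quoted.
demand_eb = (15, 45)
share_of_2027_supply = (0.26, 0.77)

# Implied 2027 supply = demand / share, computed separately for each end.
implied_supply_eb = [d / s for d, s in zip(demand_eb, share_of_2027_supply)]

for d, s, supply in zip(demand_eb, share_of_2027_supply, implied_supply_eb):
    print(f"{d} EB at {s:.0%} of supply implies about {supply:.1f} EB of 2027 supply")
```

Both ends point to roughly 58 EB of annual supply in 2027, so the two percentages are internally consistent with a single supply estimate.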

Furthermore, memory has become one of the most durable monetization layers across the AI stack. On-premise DRAM, memory interface chips, CXL expansion and tiered storage architectures will all capture long-term industrial value.

Tight Capacity Equals Pricing Power: ABF Substrates, Foundry & Enabling Components

Beyond CPUs and DRAM, the most attractive excess return opportunities lie in enabling segments with prolonged production cycles and lengthy verification lead times.

ABF Substrates: The AI-driven ABF upcycle will extend to the end of the decade, with structural supply-demand gaps emerging around 2026–2027. The expansion of CPU TAM alone is expected to lift ABF demand by 5%–10% in 2030. The server CPU ABF substrate market will reach USD 4.7 billion by 2030, with CPU-driven incremental demand contributing USD 1.2 billion.

Wafer Foundry (Advanced Process): The CPU foundry market will reach USD 33 billion in 2026 and USD 37 billion in 2028. TSMC’s CPU manufacturing market share is projected to rise from 70% in 2026 to 75% in 2028, while Intel is expected to outsource server CPU production to TSMC starting in the second half of 2027.

BMC & Memory Interconnect: Aspeed is identified as the leading beneficiary of server BMC chips, holding a 70% market share in this niche; its new-generation AST2700 platform will drive a 40%–50% ASP increase. Montage dominates the memory interconnect value chain with a global revenue share of 36.8%.

CPU Sockets & Passive Components: Lotes and FIT are key suppliers of CPU sockets. Calculations indicate every 1 million additional CPUs will lift Lotes’ revenue by 0.6% and FIT’s revenue by 0.2%. For passive components, based on an average MLCC consumption of USD 30 per general server, agentic AI will generate an additional USD 500 million in MLCC demand by 2030, accounting for 2%–3% of the global MLCC market.

These components are critical links in the AI data flow chain. Once they become bottlenecks, they will gain strong pricing power. Packaging and substrate manufacturers such as Samsung Electro-Mechanics are typical beneficiaries of this logic.
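Two of the figures above lend themselves to quick sensitivity arithmetic. The sketch below rests on assumptions that are not in the report: that TSMC's share applies to the value of the CPU foundry market, that the per-million-CPU revenue sensitivity for Lotes and FIT stays linear, and that the 10 to 15 million orchestration CPUs cited earlier are the relevant increment. Treat the outputs as illustrations, not forecasts.

```python
# (year, CPU foundry market in USD bn, TSMC share in percent), as quoted above.
foundry_quotes = [(2026, 33, 70), (2028, 37, 75)]

# Assumption: share applies to market value, so revenue = market x share.
tsmc_bn = {year: market * share / 100 for year, market, share in foundry_quotes}
market_growth = 37 / 33 - 1
tsmc_growth = tsmc_bn[2028] / tsmc_bn[2026] - 1

print(f"Implied TSMC CPU foundry revenue: {tsmc_bn[2026]:.2f}bn -> {tsmc_bn[2028]:.2f}bn")
print(f"Market growth {market_growth:.1%} vs implied TSMC growth {tsmc_growth:.1%}")

# Revenue uplift in percent per 1 million additional CPUs, as quoted above.
sensitivity_pct = {"Lotes": 0.6, "FIT": 0.2}
orchestration_cpus_m = (10, 15)  # millions of CPUs, from the orchestration segment

# Assumption: the sensitivity scales linearly with CPU volume.
uplift_pct = {
    name: tuple(round(pct * cpus, 1) for cpus in orchestration_cpus_m)
    for name, pct in sensitivity_pct.items()
}
for name, (low, high) in uplift_pct.items():
    print(f"{name}: about {low:.0f}% to {high:.0f}% revenue uplift")
```

On these assumptions, the share gain lets TSMC's implied CPU revenue grow faster than the market itself, and the socket suppliers see a mid-to-high single-digit (Lotes) or low single-digit (FIT) revenue uplift.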

One core rule defines the next AI cycle:

AI profits will continuously flow to segments with the slowest capacity expansion.

Market Mispricing & Investment Timeline: From Concentration to Diversification

Although industrial fundamentals have shifted dramatically, capital markets remain anchored in the GPU-centric narrative. This misalignment means institutional capital will gradually rotate from over-concentrated computing power assets to diversified infrastructure segments.

The AI industrial evolution can be divided into three clear stages:

GPU-Dominated Phase (Completed): Computing power shortage is the core constraint
System Bottleneck Phase (Ongoing): Latency and cost constraints become prominent
Infrastructure Repricing Phase (Upcoming): Broad-based upside for memory, CPUs and high-speed interconnects

In this new cycle, excess returns will no longer be monopolized by individual chip leaders, but distributed across the entire system supply chain.

CPUs represent the most visible incremental driver, yet enabling segments are more favored by the market.

The report acknowledges that rising agentic AI workloads will structurally boost AMD’s cloud market share, though it maintains Equal-weight ratings on both AMD and Intel. It prefers to capture the agentic AI theme through stocks such as NVIDIA and Broadcom, whose earnings correlate more directly with capital expenditure and token growth, while also treating valuation constraints as a key consideration.

From a broader perspective, the core value of this report lies in upgrading the AI investment paradigm from a single-point computing arms race to system efficiency and bottleneck economics. GPUs function as the engine, CPUs as the transmission and control system, and memory plus interconnects as the fuel line and chassis. While the peak performance of individual components remains relevant, scalable returns are ultimately determined by how well the whole system works together.

For the industrial chain, this means the sources of excess returns from AI investment will become more diversified and longer-duration. Gains will no longer be limited to the most powerful GPUs, but will increasingly accrue to segments that emerge as early bottlenecks within agentic workflows and face severe capacity expansion constraints. Key trackable high-frequency metrics include the upgrade magnitude of CPU counts and memory configurations in new platform BOMs, the signing pace of long-term cloud vendor contracts, and the utilization trends of ABF substrate and advanced process node capacity.


Tiger Brokers