Evaluating AI's Economic Viability Starts with GPU Lifespan

The debate over whether the current AI boom constitutes a bubble has raged for the past two years. Yet while the market keeps asking whether the spending is justified, one fundamental premise has gone largely unexamined: that the total cost is a relatively fixed figure. A recent report from Goldman Sachs suggests this premise may be flawed.

Goldman Sachs Global Investment Research recently published a report titled "Tracking Trillions." Using NVIDIA's projected data center revenue as an anchor, the report estimates a baseline for global cumulative AI infrastructure capital expenditure from 2026 to 2031: approximately $7.6 trillion. This includes about $5.1 trillion for compute chips, $2.15 trillion for data centers, and $358 billion for power. The annual figure is projected to be around $765 billion in 2026, rising to $1.64 trillion by 2031.
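As a quick consistency check on these figures, the sketch below sums the three components and derives the compound annual growth rate implied by the 2026 and 2031 annual figures. The growth rate is not quoted from the report; it is derived here from the two endpoints only, and all variable names are illustrative.

```python
# Figures as quoted above, in billions of USD.
components = {"compute_chips": 5100, "data_centers": 2150, "power": 358}

total = sum(components.values())
shares = {name: value / total for name, value in components.items()}

# Implied compound annual growth rate between the 2026 and 2031 annual figures
# (five annual steps). Derived for illustration, not taken from the report.
capex_2026, capex_2031 = 765, 1640
implied_cagr = (capex_2031 / capex_2026) ** (1 / 5) - 1

print(f"total: ${total / 1000:.2f}T")
for name, share in shares.items():
    print(f"{name}: {share:.1%} of total")
print(f"implied annual growth 2026-2031: {implied_cagr:.1%}")
```

The components add up to roughly the $7.6 trillion headline, with compute chips accounting for about two thirds of it, and the annual figures imply growth in the range of 16 to 17 percent per year.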

However, the report's core argument is not about whether $7.6 trillion is high or low, but rather about the inherent fragility of this number. The market typically discusses AI capital expenditure (CapEx) as a demand-side issue: Can AI commercialization justify these investments? Goldman Sachs argues that supply-side uncertainties are equally significant and severely underestimated. The final cost of building this infrastructure hinges on a small set of rarely debated assumptions. Altering any one of them could shift the projected figures by trillions of dollars.

**How Many Years Should a Chip Be Depreciated?** The report identifies four assumptions with the greatest impact on the total cost. The foremost is the economic useful life of an AI chip, a variable sparking intense debate on Wall Street. Hyperscale operators currently depreciate GPU servers over four to six years. However, NVIDIA's product release cadence has accelerated toward an annual rhythm: Hopper (2022), Blackwell (2024), Rubin (2026), Rubin Ultra (2027). Each generation represents an order-of-magnitude leap in efficiency and performance rather than an incremental improvement, which makes a five- to six-year depreciation cycle increasingly difficult to justify economically.

Goldman's sensitivity analysis shows that shortening the chip's useful life from 5 years to 3 years would increase the implied total depreciation over 2026-2031 from about $3 trillion to nearly $4 trillion. Conversely, extending it to 7 years would reduce it to $2.2 trillion. Adjusting this single parameter shifts the depreciation burden on the ecosystem by roughly $0.8 trillion to $1 trillion in either direction.
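The mechanism behind this sensitivity can be illustrated with a minimal straight-line depreciation sketch. It is not Goldman's model: the flat spend of 100 units per year, the full-year convention in the purchase year, and the zero salvage value are all assumptions chosen to keep the arithmetic transparent, and the function name is hypothetical.

```python
def cumulative_depreciation(capex_by_year, useful_life, window_years):
    """Straight-line depreciation recognized inside the window.

    capex_by_year: spend per year, index 0 = first year of the window.
    Assumes a full year of depreciation in the purchase year and no salvage value.
    """
    total = 0.0
    for purchase_year, spend in enumerate(capex_by_year):
        annual_charge = spend / useful_life
        # Years of this cohort's life that fall inside the window.
        years_in_window = min(useful_life, window_years - purchase_year)
        total += annual_charge * max(years_in_window, 0)
    return total


# Hypothetical flat spend of 100 units per year over a six-year window.
hypothetical_capex = [100] * 6
results = {
    life: cumulative_depreciation(hypothetical_capex, life, 6)
    for life in (3, 5, 7)
}
for life, depreciation in results.items():
    print(f"{life}-year life: {depreciation:.0f} of 600 spent is expensed in the window")
```

With these toy inputs, a shorter life pulls more of the same spending into the six-year window and a longer life pushes it out. Only the direction matches the report's finding; the magnitudes depend on the actual spending profile and the pre-existing fleet, which this sketch ignores.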

Well-known investor Michael Burry cited this as his core argument when publicly shorting NVIDIA and Palantir in the second half of 2025. He estimated that from 2026 to 2028, hyperscalers would cumulatively understate depreciation by about $176 billion due to overestimating chip lifespan, thereby inflating profits by over 20%. His judgment is that the true economic life of chips is closer to 2-3 years, and current accounting practices are merely a form of earnings management.

Major companies have moved in different directions. Amazon shortened the depreciation period for some servers from 6 to 5 years in early 2025, absorbing an operating profit impact of about $700 million, and recorded $920 million in accelerated depreciation in Q4 2024 for a batch of early-retired equipment. Microsoft's CEO, Satya Nadella, publicly stated the company is intentionally staggering its procurement of different chip generations to avoid carrying a four-to-five-year depreciation burden on a single product line. Conversely, Meta extended the useful life of its servers three times within three years. The latest extension, in January 2025, converted reduced depreciation into a $2.9 billion quarterly profit boost and coincided with Amazon's move to shorten its cycle.

CoreWeave's CEO provided counter-evidence: the company's A100 chips purchased in 2020 are still running at full capacity, and a batch of H100 chips at the end of their contract were immediately re-leased at 95% of their original price. The Goldman report also acknowledges that older chips retain economic value in lower-sensitivity scenarios like inference, edge computing, and synthetic data generation, suggesting a tiered deployment model could support longer lifespans.

The essence of the debate is not a technical issue but a profit and loss statement issue. The depreciation period determines how much cost is amortized each year, thereby influencing the book return on this trillion-dollar gamble.

**Data Centers Are Evolving into Something Else** The second critical assumption is data center construction cost. Goldman's baseline assumption is $15 million per megawatt. However, the report notes this figure faces upward pressure. Traditional cloud data centers cost about $10 million per megawatt. AI-era data centers are fundamentally different: rack power density has surged from 5-15 kW in the past to 130-200 kW in the Blackwell era, and over 500 kW in the Rubin era; cooling has shifted from air to full liquid; and compute, memory, networking, and power must be co-designed, not independently stacked.

NVIDIA's Vera Rubin platform unveiled at GTC 2026 pushed this pressure to a new extreme. The NVL72 rack packs 72 Rubin GPUs and 36 Vera CPUs into a standard 42U cabinet, consuming power equivalent to 40 U.S. households, requiring direct liquid cooling with 45°C inlet water and 800V DC power—requirements most existing facilities cannot meet. Future configurations like NVL576 point towards 600 kW per rack.

Goldman's sensitivity analysis shows that adjusting the data center cost from $15 million to $19 million per megawatt would increase the six-year cumulative data center CapEx from $2.15 trillion to $2.72 trillion, an increment exceeding $570 billion. Furthermore, buildings and power facilities are typically designed to last 20-25 years. When a facility's technical requirements may change fundamentally within two years of operation, the "durability" of such long-lived assets becomes a risk.
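The quoted figures are consistent with a simple linear scaling, sketched below. The key assumption, which the report as summarized here does not state, is that the capacity to be built stays fixed while only the cost per megawatt changes. The implied capacity is backed out from the baseline and is not a figure quoted from Goldman; the $17 million case is a hypothetical midpoint.

```python
BASELINE_COST_PER_MW = 15e6   # USD per MW, Goldman baseline as quoted above
BASELINE_DC_CAPEX = 2.15e12   # USD, six-year cumulative data center CapEx

# Assumption: built capacity is fixed and CapEx scales linearly with cost per MW.
implied_capacity_mw = BASELINE_DC_CAPEX / BASELINE_COST_PER_MW


def data_center_capex(cost_per_mw):
    """Cumulative data center CapEx in USD at a given cost per megawatt."""
    return implied_capacity_mw * cost_per_mw


print(f"implied capacity: {implied_capacity_mw / 1000:.0f} GW")
for cost in (15e6, 17e6, 19e6):
    capex = data_center_capex(cost)
    delta = capex - BASELINE_DC_CAPEX
    print(f"${cost / 1e6:.0f}M/MW -> ${capex / 1e12:.2f}T (+${delta / 1e9:.0f}B vs baseline)")
```

Under this assumption, the baseline corresponds to roughly 143 GW of capacity, and every additional $1 million per megawatt adds on the order of $143 billion to the six-year total.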

The report also highlights an awkward reality: "transitional AI data centers" built less than two years ago may already be unable to meet the power and cooling demands of next-generation chips. When a data center is designed for a 20-year life but its design may be obsolete within two years, longevity itself becomes a liability.

**Will Savings Be Reinvested or Reduce Spending?** The third assumption involves chip architecture choices. Beyond GPUs, an increasing share of compute will be delivered via ASICs (Application-Specific Integrated Circuits): Google's TPU, AWS's Trainium, Meta's MTIA, and OpenAI's custom chips developed with Broadcom. These chips offer lower cost and power consumption per unit of effective compute for specific tasks.

Recent contracts illustrate the scale: Anthropic announced in October 2025 a deal to purchase up to 1 million TPUs from Google, a deal valued in the "hundreds of billions of dollars." In April 2026, this partnership expanded to 5 GW of TPU capacity with a $40 billion Google investment. Broadcom's AI ASIC revenue for fiscal 2025 was approximately $20 billion, with a backlog of $73 billion. Morgan Stanley raised its 2027 TPU shipment estimate to 5 million units and 7 million for 2028.

But the report poses a crucial question: Will these cheaper compute options ultimately reduce the total build-out scale, or will they simply be absorbed by new use cases? This is framed as a variable: the elasticity of compute demand. In one scenario, demand is relatively fixed. Organizations know the size of the model they need to train or the number of users to serve. Cheaper chips directly shrink the capital pool, and architecture choices tangibly change the total. In another scenario, demand follows price. As compute becomes cheaper, teams will train larger models, run longer contexts, and deploy AI into more previously marginal use cases—effectively spending the savings again. The total infrastructure scale remains unchanged; what changes is who captures the profit in between.
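The two scenarios can be made concrete with a constant-elasticity demand model. This is a textbook simplification introduced here for illustration, not the report's method, and the 40% price drop and the elasticity values are hypothetical.

```python
def total_spend(price_ratio, elasticity, baseline_spend=1.0):
    """Spend after a price change under constant-elasticity demand.

    Quantity scales as price_ratio ** (-elasticity), so spend, which is
    price times quantity, scales as price_ratio ** (1 - elasticity).
    """
    return baseline_spend * price_ratio ** (1 - elasticity)


# Hypothetical: compute becomes 40% cheaper per unit (price_ratio = 0.6).
scenarios = {
    "fixed demand (elasticity 0)": 0.0,
    "savings fully respent (elasticity 1)": 1.0,
    "demand outruns price cuts (elasticity 1.5)": 1.5,
}
for label, elasticity in scenarios.items():
    spend = total_spend(0.6, elasticity)
    print(f"{label}: spend = {spend:.2f}x baseline")
```

An elasticity of 0 corresponds to the first scenario, where cheaper chips shrink the capital pool in proportion to the price cut. An elasticity of 1 corresponds to the second, where total spending is unchanged. Values above 1 describe a case in which cheaper compute increases total spending.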

The report notes that NVIDIA's data center GPU gross margin is around 75%, far exceeding other chip suppliers. Applied to trillions of dollars of spending, a margin of that size creates a profit pool that strongly motivates hyperscalers to develop their own chips. The critical question is whether this motivation leads to "spending less" or "consuming more," and the two answers imply entirely different outcomes. The report's baseline currently leans towards the latter. In a phase where compute demand is far from saturated, cheaper computation tends to spur more usage, not less investment. Architectural changes alter value distribution, not the total size of the pie.

The authors acknowledge this judgment is time-bound. When the proportion of inference load increases, margin pressures mount, and the return on marginal compute begins to diminish, cheaper chips could indeed start compressing total expenditure. But that stage has not yet arrived.

**Bottlenecks Don't Change Cost, But May Erode Confidence** The fourth assumption is the lengthening of construction cycles. Queues for power grid connections, approval processes, shortages of specialized labor, and extended lead times for components like transformers, cooling equipment, and GPUs (now 36 to 52 weeks) are widening the gap between capital investment and operational capacity.

The lengthening itself doesn't change the unit cost. The price of power, the cost per megawatt for the data center, and chip efficiency remain the same. It operates through a different mechanism: by extending timelines and increasing coordination complexity, it exposes all parties—those bound by take-or-pay contracts, credit providers, and operators relying on secondary market financing—to prolonged uncertainty.

Goldman Sachs believes that in a baseline scenario, bottlenecks merely slow the deployment pace rather than reduce the total volume. Project delays and capital duplication (most notably building private power generation to bypass grid queues) result in a less efficient but similarly sized construction process. However, when bottlenecks are severe and persistent enough, the narrative can shift from the supply side to the demand side. When numerous projects are simultaneously stalled, market focus shifts from "how do we build this" to "should we be building this much at all?" The report terms this a feedback loop: supply-side friction feeds back into demand-side doubt.

The report judges the current environment is closer to the baseline scenario, but the buffer is thin. The combined 2026 CapEx guidance from the top five hyperscalers has climbed to around $700 billion (based on analyst consensus), more than tripling from just over $200 billion in 2024. Capital intensity has reached 45%-57% of revenue, resembling utility companies more than tech firms. In 2025 alone, these companies raised over $108 billion through bond markets, with projected issuance of $1.5 trillion in the coming years. At this level of leverage, execution delays can easily translate into demand-side skepticism.

**Factors That Don't Change the Total, and a Circular Paradox** The report also lists factors that seem important but have limited impact on the total scale. The shifting ratio between training and inference affects the payback speed, not the total infrastructure required. Severe fluctuations in memory prices are essentially a reflection of supply-demand imbalance under extreme procurement volumes; Goldman expects similar short-term shocks to recur in other areas like optical interconnects and packaging. Building private power does raise per-project costs, but power accounts for only about $358 billion, or less than 5%, of the six-year cumulative investment. Even widespread adoption would not significantly alter the overall $7.6 trillion figure. These variables determine who makes money and when, not the total amount to be spent.
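The claim about power can be checked with simple arithmetic. In the sketch below, the cost multipliers are hypothetical stress cases rather than figures from the report, and the function name is illustrative.

```python
TOTAL_CAPEX = 7600   # billions USD, six-year cumulative baseline
POWER_CAPEX = 358    # billions USD, power component

power_share = POWER_CAPEX / TOTAL_CAPEX


def total_with_power_multiplier(multiplier):
    """Total CapEx if power costs are scaled by a hypothetical multiplier."""
    return TOTAL_CAPEX + POWER_CAPEX * (multiplier - 1)


print(f"power share of total: {power_share:.1%}")
for multiplier in (1.5, 2.0):
    new_total = total_with_power_multiplier(multiplier)
    change = new_total / TOTAL_CAPEX - 1
    print(f"power cost x{multiplier}: total ${new_total / 1000:.2f}T ({change:+.1%})")
```

Even a doubling of power costs would raise the total by less than 5%, which is small next to the swings produced by chip lifespan or data center cost per megawatt.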

The report candidly admits that its analysis rests on circular logic: if the build-out succeeds, infrastructure expands, bottlenecks are resolved, and compute prices continue to fall, the result may not be oversupply. Instead, lower price points could activate a new wave of demand and new use cases. The very infrastructure deemed sufficient for today's AI ambitions is what opens up tomorrow's technological opportunities, and so proves insufficient for them.

