According to warnings from industry leaders, the critical bottleneck in artificial intelligence development has shifted from computing power to memory. During both training and inference of large AI models, graphics processing units (GPUs) must exchange tens of terabytes of data per second with the memory system—a situation likened to a high-performance sports car fed through a narrow fuel line. Only three companies worldwide (SK Hynix, Samsung, and Micron) can mass-produce the high-bandwidth memory (HBM) that advanced AI applications require, and their production capacity is reportedly fully booked through 2028. The result is a severe imbalance between the growth of computing power and the growth of memory capacity.
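The imbalance the paragraph describes can be made concrete with a back-of-envelope "roofline" calculation: a processor's attainable throughput is capped either by its compute peak or by memory bandwidth times arithmetic intensity (useful operations per byte moved). The hardware numbers below are illustrative assumptions for an HBM-class accelerator, not specifications of any particular GPU.

```python
# Roofline sketch: is a workload compute-bound or memory-bound?
# PEAK_FLOPS and PEAK_BW are assumed, illustrative figures.

PEAK_FLOPS = 1.0e15   # assumed peak compute: 1 PFLOP/s
PEAK_BW = 3.0e12      # assumed memory bandwidth: 3 TB/s (HBM-class)

def attainable_flops(arithmetic_intensity):
    """Attainable throughput is the lesser of the compute peak and
    bandwidth * arithmetic intensity (FLOPs per byte moved)."""
    return min(PEAK_FLOPS, PEAK_BW * arithmetic_intensity)

# A large matrix multiply reuses each byte many times (high intensity),
# while decoding one token of an LLM streams the weights with little
# reuse (low intensity) -- the intensities here are rough assumptions.
for name, ai in [("matmul-like", 500.0), ("LLM decode", 2.0)]:
    perf = attainable_flops(ai)
    bound = "compute-bound" if perf >= PEAK_FLOPS else "memory-bound"
    print(f"{name}: {perf / 1e12:.0f} TFLOP/s attainable ({bound})")
```

Under these assumed numbers, the decode-style workload reaches only a few TFLOP/s out of a 1 PFLOP/s peak: the compute units sit idle waiting on memory, which is precisely why faster GPUs alone cannot close the gap without faster HBM.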
The memory bottleneck creates systemic challenges on three time horizons. In the short term, it leaves computing resources underutilized, driving up the cost of model training and inference. Over the medium term, it will test the limits of chip manufacturing capacity and of the broader semiconductor supply chain. In the long term, the constraint extends to supporting infrastructure: power supply, cooling systems, materials science, and interconnect capabilities.
Competition in artificial intelligence has thus evolved beyond algorithmic superiority into a test of national industrial ecosystems—spanning computing power, memory manufacturing, energy resources, and infrastructure. The ceiling on AI progress will be set not by occasional model breakthroughs but by the deep industrial foundation needed to keep complex systems running reliably. Models and frameworks will keep iterating, but falling behind in fundamental capabilities such as chips, memory, and materials could take years to recover from. In its ultimate form, AI is a reliably operating intelligent industrial machine, not merely a sophisticated standalone model.