GF Securities released a research report stating that at the 2026 GTC conference, Jensen Huang projected the potential market size for AI inference chips could reach $1 trillion by 2027. The AI industry narrative is shifting from training-driven to inference-driven, with advantages in inference cost-effectiveness and stability coming from full-stack AI deployment and hardware-software co-design. On the enterprise side, the Agent as a Service business model is expected to mature gradually. NVIDIA's broad collaborations with traditional SaaS companies indicate that AI is not solely about disruption but also about empowering and co-creating with existing software. The proliferation of Agents creates new demand for computing power, inference models, and security infrastructure. Physical AI applications such as robotics and autonomous driving are moving rapidly toward real-world deployment, while world models and simulation training are forming a new technological paradigm. Key viewpoints from GF Securities are as follows:
Inference is becoming the core driver of AI computing power growth. NVIDIA is rebuilding computing infrastructure around its AI Factory system and accelerating the deployment of Agents and Physical AI. The industry narrative is transitioning from "breakthroughs in training performance" to "optimization of inference efficiency." At the 2026 GTC conference, Jensen Huang estimated the potential market for AI inference chips could hit $1 trillion in 2027. Demand for inference computing power is showing structural growth for two reasons: (1) high-quality training data is becoming scarcer, leading model training to rely more on synthetic data and on inference-heavy post-training; (2) Agent applications consume one to two orders of magnitude more tokens than traditional chatbots. Deloitte anticipates that inference workloads' share of global AI computing will rise from approximately one-third in 2023 to about two-thirds by 2026, with long-term potential to exceed 80%. As model API prices continue to decline, with inference costs for mainstream models having dropped by over an order of magnitude since 2023, cost efficiency and response speed are becoming critical competitive factors for AI application deployment.
NVIDIA provides comprehensive AI infrastructure through vertical integration of hardware and software. The Vera Rubin computing platform integrates seven chips, including the Vera CPU, Rubin GPU, and NVLink-72 interconnect, significantly boosting inference efficiency through system-level co-design: inference performance improves approximately 35-fold over the previous generation, and token generation speed rises 350x. For low-latency scenarios such as Agents, NVIDIA introduced the Groq LPU architecture, which replaces HBM with on-chip SRAM to reduce latency, and launched the LPX rack, which it projects could deliver up to a 10x commercial benefit.
At the application level, AI Agents and Physical AI represent new growth directions. Agents are expected to become essential for enterprises, with the SaaS model evolving into an Agentic AI as a Service model. NVIDIA introduced NemoClaw to enhance the OpenClaw ecosystem, providing a secure sandbox environment through OpenShell for permissions management and security control. Nemotron-3 Super ranked fourth globally and first among open-source models in the OpenClaw benchmark tests. In the Physical AI domain, NVIDIA unveiled the Cosmos world model, the Isaac simulation framework, and the GR00T robotics foundation model, forming a complete technology stack of "world model + simulation training + robotics model." In autonomous driving, NVIDIA's Robotaxi ecosystem continues to expand, with collaborations like the one with Uber to advance the deployment of autonomous vehicle fleets.
Regarding investment targets, focus on companies that vertically integrate self-developed models, cloud, and ecosystem. In the short term, attention is on Google; medium- to long-term prospects include Microsoft (MSFT.US), Alibaba (09988), and Tencent (00700). For multimodal application scenarios, consider Kuaishou-W (01024) and Meitu (01357). For IP + AI video, potential targets include China Literature (00772), Chinese Online (300364.SZ), Shanghai Film (601595.SH), Huanyu Century (000892.SZ), Huace Film & TV (300133.SZ), and Zhangyue Technology (603533.SH). For AI marketing, watch Mobvista (01860), Yidianzixun (301171.SZ), and BlueFocus (300058.SZ). For AI companion social platforms, consider Kingnet Network (002517.SZ); for AI gaming, look at Heartbeat Company (02400); for AI content rights management, focus on Vobile Group (03738).
Risks include model iteration falling short of expectations, slower-than-expected commercialization and application rollout, and risks relating to copyright, ethics, and content quality.