Key Takeaways from Jensen Huang's Wall Street and Podcast Remarks This Week

Deep News
Yesterday

Beyond the major GTC conference keynote, NVIDIA CEO Jensen Huang used an in-depth interview and a financial-analyst Q&A session to lay out his vision for the company at its peak market capitalization. From "$1 trillion order visibility" to "agents are the future personal computers," Huang is not just selling chips; he is reshaping the allocation logic of the global IT industry.

Key quotes compiled are as follows:

Regarding Tokens and Value: If your engineer earning $500,000 a year spends only $5,000 on Tokens annually, I would go crazy. If that $500,000 engineer doesn't consume at least $250,000 worth of Tokens, I would be deeply concerned. Even if the chip were free, if it cannot keep up with the state of the art and the speed at which we operate, it still isn't cheap enough.

Regarding Agents and the Future: Every engineer will have 100 agents. In the past, we wrote code; in the future, we will write ideas, architectures, and specifications. Agent systems are the systems that get work done; they are helping our software engineers complete tasks.

Regarding Token Economics: Computers were once just tools; the computers of the future are manufacturing equipment. People buy these computers to produce Tokens, and the efficiency of producing these Tokens is crucial. You are simultaneously buying the most expensive computer and producing the lowest-cost Tokens.

Regarding Market Demand and Growth: We have robust visibility exceeding $1 trillion in demand, orders, and requirements for Blackwell plus Rubin. Our growth rate is actually accelerating. Every software company, every company, needs an OpenClaw strategy.

Regarding Competition and Architecture: Anyone who says "my chip is 30% cheaper" only proves they don't understand AI. You breathe air until you run out of it; after that, we'll breathe compressed liquid air. But until then, what about air? It's free, and we've been using it for a long time.

Financial Impact: $1 Trillion Order Backlog

Huang revealed a staggering figure during the analyst meeting: NVIDIA's visibility into order demand for the Blackwell and Rubin architectures has surpassed $1 trillion.

The Logic Behind Growth: This number is not an exaggeration but is based on confirmed purchase orders and factory production pipelines. Huang emphasized that NVIDIA's advantage lies in its significantly faster delivery cycles compared to companies developing their own ASIC chips, even achieving "order and ship within the same quarter."

Gross Margin Moat: Addressing concerns about "value being extracted by NVIDIA," Huang stated directly: "TSMC's wafers are the most expensive in the world, but they offer the highest value, so I'm happy to pay." He believes customers are not buying expensive computers but rather the world's lowest-cost Token production capability.

The Third Inflection Point: From Large Models to "Agents"

Huang believes AI has progressed through generative and reasoning stages and is now at the third inflection point—Agentic Systems.

Tokens Become the New Salary: In the future, when companies hire engineers, they will provide not just laptops but also Token budgets. If an engineer earning $300,000 a year doesn't consume Tokens, they are wasting productivity.

The Birth of the Personal AI Computer: Systems represented by the open-source project OpenClaw are defined by Huang as "the first personal artificial intelligence computer in human history." It possesses memory, scheduling, skills, and APIs, acting as the future operating system for the IT industry.

New Hardware Landscape: The "Marriage" of Vera Rubin and Groq

NVIDIA is no longer just a GPU company but an "AI factory" company.

Disaggregated Inference: This is the core of the Dynamo operating system. By breaking down inference tasks, chips with different performance characteristics handle specific roles.

Groq's Role: NVIDIA's acquisition and integration of Groq (LPX series) is not meant to replace GPUs but to utilize its extremely low-latency SRAM architecture for the "final step" of autoregressive inference.

Trinity Architecture: NVIDIA is the only company globally capable of simultaneously optimizing HBM (High Bandwidth Memory), LPDDR5, and SRAM. This "liquid-cooled, rack-scale" full-system delivery makes competitors' single-point chips seem like "frankenstein" solutions.
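The prefill/decode split behind disaggregated inference can be sketched as a toy scheduler. Everything here is illustrative: the device names, throughput, and latency figures are invented for the example, and a real system such as Dynamo makes these placement decisions dynamically per request.

```python
# Toy sketch of disaggregated inference: route the compute-heavy prefill
# phase to a high-throughput pool and the latency-sensitive autoregressive
# decode phase to a low-latency pool. All figures are illustrative.
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    tokens_per_sec: float   # sustained throughput
    step_latency_ms: float  # latency per decode step

DEVICES = {
    "prefill": Device("high-throughput GPU pool", 50_000, 20.0),
    "decode":  Device("low-latency SRAM pool",     8_000,  0.5),
}

def route(phase: str) -> Device:
    """Pick a device pool by inference phase."""
    if phase not in DEVICES:
        raise ValueError(f"unknown inference phase: {phase}")
    return DEVICES[phase]

def request_time_ms(prompt_tokens: int, output_tokens: int) -> float:
    """Estimated wall time: one batched prefill plus sequential decode steps."""
    prefill_ms = prompt_tokens / route("prefill").tokens_per_sec * 1000
    decode_ms = output_tokens * route("decode").step_latency_ms
    return prefill_ms + decode_ms

print(f"{request_time_ms(4000, 500):.1f} ms")  # 80 ms prefill + 250 ms decode = 330.0 ms
```

The point of the split is that the two phases stress hardware differently: prefill is one large parallel batch, while decode is a long chain of tiny latency-bound steps, so each can run on silicon shaped for it.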

Physical AI: A $50 Trillion Blue Ocean

Huang is particularly optimistic about Physical AI, believing its ultimate scale will surpass digital AI.

Reshaping Traditional Industries: This is a $50 trillion industry that has been largely a technological desert for the past 20 years. From robotic surgery and autonomous driving to smart base stations, Physical AI must operate at the edge while adhering to physical laws.

Robot Explosion in 3-5 Years: Huang predicts that robots will be ubiquitous in the next 3 to 5 years. While China holds significant advantages in hardware supply chains like motors and rare earths, NVIDIA will provide the "brain" (training, simulation, and onboard computers).

Industry Symbiosis: NVIDIA as the "Best Salesperson" for Cloud Providers

Addressing the threat of cloud giants developing their own chips, Huang expressed extreme confidence:

Traffic Driver: "AWS, Google, and Microsoft have the largest booths at GTC because they want to sell their services to my CUDA developers." NVIDIA funnels developers to the cloud through the CUDA ecosystem, essentially acting as a customer acquisition engine for cloud service providers.

Irreplaceability: 40% of business comes from non-major cloud provider sectors (regional clouds, enterprise private clouds). These customers buy the "full-stack platform," not just "chips." Without NVIDIA's full-system solutions, these markets would be inaccessible.

The following is the full text of two interviews, assisted by AI translation:

**All-In Podcast Interview**

Host (Jason Calacanis): This is a special episode; we made an exception from our regular schedule. We usually only make exceptions for three people: President Trump, Jesus, and Jensen Huang. You can rank them yourself. Jensen, your year has been incredible, and this event is amazing. Every industry, every tech company, every AI company is here.

Host (Chamath Palihapitiya): One of the biggest announcements last year was NVIDIA's acquisition of Groq. When you decided to buy Groq, did you realize how insufferable Chamath would become?

Jensen Huang: I had a feeling (laughs). We are his friends, dealing with him every week. I knew what you had to go through with him during those six weeks of closing.

Jensen Huang: Actually, much of our strategy was already publicly disclosed at GTC. Two and a half years ago, I introduced the AI factory operating system, called Dynamo. As you know, the dynamo, invented by Siemens, converted mechanical energy into electrical energy; it powered the last industrial revolution. I thought it was the perfect name for the "factory operating system" of the next industrial revolution. Inside Dynamo, the core technology is Disaggregated Inference. Jason, I know you're strong technically; I'll let you explain it to the audience.

Host (Jason Calacanis): Thank you for not just taking over. Disaggregated inference means the inference processing pipeline is extremely complex; it's the most complex computing problem today. The scale is massive, involving mathematical operations of all shapes and sizes. The idea is to break the processing apart, running some parts on certain GPUs and the rest on different GPUs.

Jensen Huang: That realization, that even disaggregated computing makes sense, guided our acquisition of Mellanox. Today, NVIDIA's computing is distributed across GPUs, CPUs, switches, and network processors. Now we've added Groq; we will place the right workload on the right chip. We have evolved from a GPU company into an AI factory company.

Host (Chamath Palihapitiya): You mentioned on stage that high-value inference users should pay attention to this. You said 25% of data center space should be allocated to this combination of Groq LPUs and GPUs. Can you tell us how the industry views this new form of "prefill-decode disaggregated" computing?

Jensen Huang: Stepping back, when we added this technology, we were moving from "large language model processing" to "Agentic Processing." When you run an agent, you access working memory, long-term memory, use tools, and place huge pressure on storage. Agents also collaborate with each other. Therefore, data centers host various types of models. We created the Vera Rubin architecture to handle these incredibly diverse workloads. NVIDIA's total addressable market (TAM) has increased by 33% to 50% as a result. A significant portion of this will be storage processors (BlueField), Groq processors, CPUs, and network processors. Together, these constitute the computer for the AI revolution, which is the "agent."

Host (David Friedberg): What about embedded applications? For example, if my daughter's teddy bear at home wants to talk to her. Will it have custom chips inside, or will there be a broader set of tools developed for the edge?

Jensen Huang: At a large scale, we believe this problem involves three computers: The first is for training AI models. The second is for evaluation. For example, robots and autonomous vehicles must be evaluated in a virtual laboratory (Omniverse) that complies with physical laws. The third is the edge robot computer. It could be a car, a robot, or a small teddy bear. We are also doing something very important: transforming telecommunications base stations into part of the AI infrastructure. This is a $2 trillion industry; future radio base stations, factories, and warehouses will become extensions of AI.

Host (Brad Gerstner): Jensen, last year you were ahead of everyone, predicting that inference wouldn't grow just 1,000x. Now, is it going to grow a million times, or a billion? People thought you were exaggerating then because everyone was focused on training, but inference demand has since exploded. Some say your inference factories cost $40-50 billion while custom-chip (ASIC) solutions cost only $25-30 billion, suggesting you'll lose market share. What's your view? Why would anyone pay a 2x premium?

Jensen Huang: The core logic is: you shouldn't equate the "price of the factory" with the "cost of the Token." I can prove that a $50 billion factory can produce the lowest-cost Tokens for you. Because our efficiency is 10 times higher. In the $50 billion budget, $20 billion is for land, power, and buildings—costs incurred regardless of the chip used. The remaining difference is not significant in the overall cost. But if my data center throughput is 10 times that of others, then even if their chips were free, they couldn't compete with us.
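Huang's arithmetic can be checked with a back-of-the-envelope model. The $20 billion site cost and rough factory prices are the figures he cites; the 10x throughput multiple is his claim, and the lifetime token count is an arbitrary unit chosen only so the division works.

```python
# Back-of-the-envelope check of the "price of the factory" vs.
# "cost of the token" argument. Site cost and factory prices are the
# rough figures from the interview; the 10x throughput multiple is
# Huang's claim; the lifetime token count is an illustrative unit.
def cost_per_token(site_capex: float, chip_capex: float,
                   lifetime_tokens: float) -> float:
    """Total capital cost amortized over lifetime token output."""
    return (site_capex + chip_capex) / lifetime_tokens

SITE = 20e9          # land, power, buildings: paid regardless of chip
BASE_TOKENS = 1e15   # lifetime output of the cheaper factory (illustrative)

premium = cost_per_token(SITE, 30e9, 10 * BASE_TOKENS)  # ~$50B, 10x output
asic    = cost_per_token(SITE, 10e9, BASE_TOKENS)       # ~$30B factory
free    = cost_per_token(SITE, 0.0,  BASE_TOKENS)       # chips given away

# The pricier factory wins on cost per token, even against free chips.
assert premium < asic and premium < free
print(f"{asic / premium:.0f}x cheaper per token")  # prints "6x cheaper per token"
```

Under these assumptions the $50 billion factory still produces the cheaper token and beats even a free chip, which is the shape of the "even if their chips were free" argument.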

Host (Chamath Palihapitiya): You manage the world's highest-valued company, with revenue potentially exceeding $350 billion next year. How do you decide "what to do"? How do you gain the intuition to know where to double down and where to pull back?

Jensen Huang: That's the CEO's job: to define the vision and strategy. I get inspired by the excellent scientists and technologists in the company, but I must shape the future. My criterion is: Is this thing outrageously difficult? If it's easy, we should back off because there will be many competitors for easy things. I look for things that have never been done before, are extremely difficult, and leverage NVIDIA's "superpowers." We know this brings pain and hardship, but without pain, there is no great invention.

Host (Brad Gerstner): Can you talk about a few long-term businesses? Like space data centers, autonomous driving, or digital biology?

Jensen Huang: Physical AI is a huge category. This is a $50 trillion industry that has been almost a technological desert until now. We started laying the groundwork 10 years ago; now it's at an inflection point, bringing us nearly $10 billion in revenue annually. Digital biology is having its "ChatGPT moment." We are beginning to understand how to represent genes, proteins, and cells. Within the next 5 years, the healthcare industry will undergo massive changes. Agriculture is the same.

Host (Jason Calacanis): NVIDIA started with gamers and enthusiasts. You mentioned the "agent" revolution, especially open-source agents on the desktop. What does this mean for you?

Jensen Huang: There have been three inflection points in the last two years: Generative AI: ChatGPT made everyone aware of AI's existence. Reasoning: Models like O1 enabled AI not just to answer questions but to reason. Agents: Initially enterprise tools like Claude Code, but OpenClaw brings AI agents to the masses. OpenClaw is important because it redefines the computing model. It has memory, a file system, skills, resource scheduling, and APIs. This is essentially the first personal artificial intelligence computer in human history. It's open-source and can run anywhere.

Host (Brad Gerstner): Does this paradigm shift make current AI regulation legislation irrelevant? How should politicians respond to such rapid change?

Jensen Huang: We need to stand before decision-makers and inform them of the truth about the technology: it is not biology, not aliens, has no consciousness; it is just computer software. We cannot let "doomsday theories" influence policy. As a nation, the biggest security risk is not AI itself, but when we stall due to fear while other countries adopt the technology. I'm concerned about the pace of AI adoption in the United States.

Host (Brad Gerstner): Regarding the previous controversy between Anthropic and the Department of Defense, if you were on their board, what advice would you give them to change public fear?

Jensen Huang: Anthropic's technology is excellent; we are also a major customer. Warning about the potential of technology is good, but "warning" is not the same as "frightening." As technology leaders, we must speak more carefully and humbly. Spreading "catastrophic" statements without any evidence causes more harm than people think. Now that technology is so crucial to social structure and national security, our words are paramount.

Host (Brad Gerstner): US public support for AI is only 17%. We once destroyed the nuclear energy industry due to fear; now China is building 100 nuclear reactors, and we are building zero. Back to agents, how is productivity improvement within NVIDIA? Everyone talks about ROI. Do you think we'll see revenue grow exponentially like intelligence?

Jensen Huang: Look at the audience in this venue; 99% of things here are AI. From generative to reasoning, computing increased 100x; from reasoning to agents, computing increased another 100x. Computing has increased 10,000x in two years. People will pay for "information," but they are more willing to pay for "work." Chatbots are nice, but agents that help me complete tasks are truly valuable. Agents are helping our engineers get work done. We are not yet at true mass scale; future growth will be on the order of millions of times.

Host (Jason Calacanis): NVIDIA has 43,000 employees, 38,000 of whom are engineers. If one of your engineers earning $500,000 a year only spends $5,000 on Tokens annually, what would you think?

Jensen Huang: I would go crazy! If a $500,000-a-year engineer doesn't consume at least $250,000 worth of Tokens a year, I would be very worried. It's like a chip designer saying he only needs paper and pen, no CAD tools. This is a paradigm shift, like LeBron James spending $1 million a year on his body. We are giving knowledge workers "superpowers."

Host (Jason Calacanis): In the next two to three years, what will the productivity of these "all-star" employees look like?

Jensen Huang: Thoughts like "this is too hard" or "this takes too long" will disappear. Just like after the industrial revolution, no one says "that building is too heavy." Gravity and scale are no longer problems; all that remains is creativity. In the past we wrote code; in the future we will write "ideas," architectures, and specifications. Every engineer in the future will manage 100 agents.

Host (David Friedberg): Last Sunday night, I spent 90 minutes using an agent to replace an entire software architecture that would have required many people. This sense of acceleration is unprecedented.

Jensen Huang: This is why OpenClaw is so incredible. Some think the enterprise software industry will be destroyed, but my view is the opposite: this industry was previously limited by "the number of employees at their desks." In the future, there will be 100 times more agents hitting SQL databases, Photoshop, and Blender. These tools are the last means for humans to control the outcome.

Host (Chamath Palihapitiya): Regarding open source, some open-source models from China are very powerful. Do you think the endgame for AI is decentralized?

Jensen Huang: We need both "models as a product (private)" and "models as open source" to coexist. For most consumers, I don't want to fine-tune models myself; I'll use ChatGPT or Claude. But for vertical industries, they need to control their domain knowledge, which must rely on open-source models. NVIDIA is also strongly supporting the open-source ecosystem because it allows startups to have world-class vertical capabilities from day one.

Host (Brad Gerstner): President Trump wants US industry to lead and US AI to go global. Currently, NVIDIA's market share in certain markets (like China) has dropped from 95% to 0%. What's the situation now?

Jensen Huang: President Trump wants us back in the fight. We have applied for and received licenses for many Chinese companies that requested purchases. We are restarting the supply chain for shipments. If the US cannot lead the AI technology stack (from chips to systems), it is a major national security loss. I hope the US computing stack accounts for 90% of the global share.

Host (Jason Calacanis): Global conflicts, the Taiwan situation, even helium supply in the Middle East—do these supply chain risks worry you?

Jensen Huang: We have 6,000 families in the Middle East; we support them 100%, and we stay in Israel 100%. Regarding Taiwan, there are three things we must do: achieve US domestic industrialization as quickly as possible (like the factory in Arizona); diversify the supply chain (Korea, Japan, Europe); and maintain patience and restraint.

Host (Jason Calacanis): In autonomous driving, your strategy is an open-source platform like Android, while Tesla is more like iOS. How do you view this chess game?

Jensen Huang: Our goal is to enable every car company in the world to make autonomous vehicles. We provide three computers (training, simulation, onboard) and have developed the safest operating system. Even customers with strong in-house capabilities like Musk and Tesla buy our training computers. We are happy to provide solutions at any level; we are not here to replace anyone but to solve problems.

Host (Brad Gerstner): But major customers like Google and Amazon are also developing their own chips (TPU, Trainium). They are both customers and competitors. How do you handle this?

Jensen Huang: We are the only AI company in the world that partners with every other AI company. I don't look at what they are developing, but I show them everything I develop. Confidence comes from two points: Buying NVIDIA products remains the most economical choice currently. We are the only cross-platform architecture (cloud, on-premise, automotive, even space). Many don't realize that 40% of NVIDIA's business comes from building complete AI infrastructure, not just selling chips. In fact, NVIDIA's market share is increasing, because Anthropic and Meta are using NVIDIA, and the explosion of open-source models runs best on NVIDIA.

Host (Brad Gerstner): Analysts seem skeptical of your growth potential. They forecast 30% growth next year, 20% the year after, and only 7% by 2029. They think the "law of large numbers" will constrain you.

Jensen Huang: They simply don't understand the scale and breadth of AI. In the past, the data center CPU market was only $25 billion a year, but our scale now is completely different. NVIDIA is not doing chips; we are doing AI infrastructure, and this market is much larger than people imagine.

Host (Chamath Palihapitiya): Talk about space data centers?

Jensen Huang: We are already in space. The challenge is heat dissipation (only through radiation), but it's not unsolvable. Currently, our chips are installed on satellites processing imagery. Instead of sending data back to Earth, it's better to process it directly in space.

Host (Jason Calacanis): In healthcare, how will AI make a real impact?

Jensen Huang: Three directions. AI Biology: predicting biological behavior and aiding drug discovery. AI Agents: assisting doctors in diagnosis. Physical AI / Robotic Surgery: future ultrasound and CT instruments will have built-in agents.

Host (Jason Calacanis): Robotics had "two lost decades," but now Musk's Optimus and robots in China are developing rapidly. How far are we from "robot chefs" or "robot butlers"?

Jensen Huang: About 3 to 5 years. China is very strong in the hardware ecosystem like motors, rare earths, magnets; the global robotics industry will rely on that supply chain. Ultimately, robotics will be the biggest driver of human prosperity. It not only solves labor shortages but allows everyone to start their own business through robots. We could even "inhabit" robotic dogs through virtual reality to chat with our children or walk the dog while traveling.

Host (Brad Gerstner): Anthropic's CEO predicts AI models and agents will generate $1 trillion in revenue by 2030. What do you think?

Jensen Huang: I think he's too conservative. Anthropic's performance will far exceed that number. Because every enterprise software company will become a "value-added reseller" for these AI models in the future.

Host (Brad Gerstner): What is the "moat" for these companies?

Jensen Huang: Deep specialization. Don't just build a generic horizontal platform; deeply cultivate a vertical domain, infusing your specialized knowledge into agents. Whoever connects with customers first gains the data flywheel.

Host (Jason Calacanis): Three years ago you said: "You won't be replaced by AI, but by people using AI." Has your view on employment changed now?

Jensen Huang: I am not a doomsayer. While drivers might decrease, "mobility assistants" will increase. Just like autopilot on planes increased the number of pilots. My advice to young people: become experts at using AI. This requires artistry, knowing how to guide AI without overly constraining it.

Host (Brad Gerstner): You once advised young people at Stanford to experience "pain and hardship." What do you advise them to learn now? Would an English major be more promising than computer science?

Jensen Huang: Deep science, math, and language skills remain crucial. Because language is the programming language for AI. Look at the example of radiology: 10 years ago, some predicted AI would make radiologists obsolete. What happened? The demand for radiologists surged. Because AI made scanning faster, hospitals could handle more patients, increasing revenue. A faster-growing country needs more teachers, more specialists, but each of them will have AI-granted "superpowers."

Host (Jason Calacanis): Jensen, congratulations on your achievements. This has been a very positive and inspiring conversation.

Jensen Huang: Thank you. We don't need to panic; we have the autonomy to choose how to create the future.

**NVIDIA GTC26 Financial Analyst Q&A**

Host: Good morning everyone. I hope you enjoyed yesterday's presentation. Although it was a bit long, I think it was a perfect summary. Now we'll turn the time over to you, focusing on your needs and questions. I'll first hand the time to Jensen (Huang).

Jensen Huang: As I mentioned yesterday, AI has recently experienced three inflection points: the first was generative AI, the second was reasoning, and we are now at the third inflection point—Agentic Systems. These systems have "autonomous action capability," hence they are called agents. You can set goals for them; they no longer just answer questions but execute tasks. One of the most popular applications is writing software. In your companies, and in mine, engineers use agent systems all day. In the past, engineers got a laptop when they joined; now they get a laptop and Tokens. Token budgets have become a tangible thing. If you hire an engineer for $300,000 a year but they don't consume any Tokens in their work, you have to ask what they are doing all day. Future computers are no longer just tools but production equipment. Like ASML's lithography machines, they produce sellable products. This is no different from the generators (Dynamos) that produced electricity long ago. Energy efficiency and production efficiency determine your revenue. Every software company, every enterprise now needs an "OpenClaw strategy," just like we had to have a Linux strategy, an internet strategy, and a mobile cloud strategy.

Jensen Huang: I want to update the situation regarding order visibility. A year ago, I said that by 2026, we had strong visibility for $500 billion in shipments for Blackwell and Rubin. Now it's March 2026, and we have strong demand visibility exceeding $1 trillion for the Blackwell plus Rubin architectures. This includes confirmed demand forecasts and purchase orders. Please note, this $1 trillion refers only to Blackwell and Rubin. I am not including Groq, standalone CPUs, or other new products, as I want to compare with last year's data. Moreover, this number will continue to grow through the end of 2027. We have inventory and supply pipelines, and can even fulfill orders and ship within the same quarter. This is something companies making ASICs cannot do because their delivery cycles are too long.

Jensen Huang: Last year (2025) was the "year of reasoning." We made everyone understand: the link between the price of a computer and the cost of a Token is small. People buy computers to produce Tokens. You buy an expensive computer, but it produces Tokens extremely fast, resulting in you having the lowest-cost Tokens. This is also why we can maintain gross margins. Each generation of our products offers higher value—namely, "Tokens produced per second, per watt." Customers would rather buy the next-generation product at a higher price than buy the old product at a lower price. Installing Vera Rubin is smarter than continuing to buy Grace Blackwell because the value is higher.

Jensen Huang: In 2025, we also expanded platform support. Anthropic and Meta became new partners. Current data from API inference service providers shows that open models have become the second most popular AI model category globally, after OpenAI. And NVIDIA is the best platform globally for running open models. We also work closely with cloud service providers (CSPs). We have CUDA on their clouds, which attracts all developers. We are the best sales force for CSPs, which is why you see AWS, Google, Microsoft, and Oracle with the largest booths in our exhibition hall—they want to sell their services to our developers. Furthermore, 40% of our business comes from non-CSP sectors, like regional clouds, industrial enterprises, etc. Without NVIDIA's full-stack platform, you simply cannot reach this 40% of the market, because they buy the "platform," not just the "chip."

**Q&A Session**

Questioner (Melius Research, Ben Reitzes): Jensen, the biggest concern is this: Is this investment worth it? Can the revenue growth of cloud providers cover these huge expenses? When will we see their revenue estimates revised upwards?

Jensen Huang: I wish those AI companies were already public so you could see what I see. Never in history have startups been able to add $1 or $2 billion in revenue per week like they are now. The $2 trillion IT software industry is consolidating around OpenAI, Anthropic, and open models. The future IT industry will become "resellers" of these models. I estimate the IT industry will grow from the current $2 trillion to $8 trillion. All IT companies will lease or produce Tokens in the future. Their business models will shift from software licensing to Token leasing. While this introduces cost of goods sold (COGS), the value provided is much higher. The revenue growth speed of OpenAI and Anthropic is like "growing a full IT company within a month."

Questioner (Cantor Fitzgerald, C.J. Muse): What kind of change will Physical AI bring to your business?

Jensen Huang: Currently, the growth rates of digital AI and physical AI are similar. But in a few years, physical AI will hit an inflection point; it must run locally, at the factory edge. Since 70% of the global $70 trillion industry involves physical atoms (not digital bits), physical AI will eventually account for 70% of our business. Future computers will run 24/7. I hope an engineer costing $2,000 a day spends a $1,000 Token budget daily. I want him to manage an entire fleet of agents working for him.

Questioner (Bernstein, Stacy Rasgon): Will Rubin be released with Groq? How is the inference workload evolving?

Colette Kress: LPX (related to Groq) is expected in the second half of this year.

Jensen Huang: Vera Rubin will ship earlier than Groq. Regarding compute architecture, there's low latency (CPU) and high throughput (GPU). Groq is an extreme low-latency architecture; almost the entire chip is SRAM. It's not flexible, but very fast. We integrate Groq with Vera Rubin, using Groq for the final stage of language model autoregressive inference. For free or standard-tier inference, Vera Rubin is unbeatable. But for extremely high-end, very smart models, adding Groq significantly boosts throughput. It's like the iPhone or automotive industry; as the market expands, stratification occurs: from free tier to geek tier (a high-end tier costing $50 per million Tokens).

Questioner (Bank of America, Vivek Arya): In the $1 trillion market, what is the proportion of CPUs, storage, etc.? Will Groq erode demand for HBM?

Jensen Huang: We are the only company that can optimize across HBM, LPDDR5, and SRAM. If the $1 trillion order included Groq, the scale would be $1.25 trillion. Storage is the second largest expense; CPUs are about 5%. Vera Rubin addresses the compute needs of "agents"—it must not only reason but also check memory, use tools, run browsers. We harmoniously integrate all these functions into a liquid-cooled rack architecture; it's no longer a "frankenstein" solution.

Questioner (Goldman Sachs, Jim Schneider): Token costs keep declining; will this flatten out?

Jensen Huang: Token costs will continue to decline. But simultaneously, the "smartness" per Token will continue to rise. Evaluating an AI factory must consider "Tokens produced per watt." Any comparison not divided by power consumption is misleading. We will continuously push the Pareto Frontier—making the factory produce more, smarter Tokens at the same cost. This is the hardest problem in computer science.
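Huang's insistence on dividing by power can be made concrete with a toy comparison; the factory names and every number below are invented for illustration.

```python
# Toy illustration: raw token throughput is misleading unless it is
# normalized by power. Rank factories by tokens per joule instead.
# All names and numbers are invented for this example.
def tokens_per_joule(tokens_per_sec: float, watts: float) -> float:
    """One watt-second is one joule, so this is tokens/s divided by watts."""
    return tokens_per_sec / watts

factories = {
    "A (higher raw throughput)":       (120_000, 60_000),  # tok/s, watts
    "B (lower throughput, efficient)": (90_000, 30_000),
}

best = max(factories, key=lambda name: tokens_per_joule(*factories[name]))
for name, (tps, watts) in factories.items():
    print(f"{name}: {tokens_per_joule(tps, watts):.1f} tokens/J")
print("winner:", best)  # B wins, 3.0 vs 2.0 tokens/J
```

Factory A looks better on raw throughput, but once throughput is divided by power draw, factory B produces more tokens per unit of energy, which is the comparison Huang argues actually matters.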

Questioner (Evercore ISI, Mark Lipacis): What do SSMs (State Space Models) and the hybrid architecture of Nemotron 3 mean for you?

Jensen Huang: The beauty of NVIDIA's architecture is that it supports everything: Transformers, diffusion models, SSMs, etc. Groq can't run diffusion models, but we can. Nemotron 3 aims to handle ultra-long contexts. We want to advance AI technology, not just compete.

Questioner (UBS, Timothy Arcuri): Some worry that NVIDIA takes too much value from the ecosystem and that gross margins are unsustainable. What's your view?

Jensen Huang: If you continue to deliver multi-fold productivity improvements, customers will be happy to partner with you. It's like TSMC's wafers are the most expensive globally but offer the highest value, so I'm happy to pay. The same goes for ASML. Those who say "my chip is 30% cheaper" simply don't understand AI. They don't understand the overall economics of an AI factory.

Questioner (Redburn Atlantic, Timm Schulze-Melander): Your employee headcount is growing slowly, but task volume is growing extremely fast. How do you balance this?

Jensen Huang: I have 60 direct reports. Our company structure reflects our product architecture. To update the product generation annually, you must fully own the entire software stack, storage, and network. You can't achieve annual updates by piecing together others' technologies. We own everything from the chip to the operating system (Dynamo), allowing us to run old software perfectly on brand-new systems from day one.

Questioner (Final Questioner): How will training demand evolve?

Jensen Huang: Training has progressed from pre-training (kindergarten) to post-training (learning skills). Post-training requires reinforcement learning, tool use, etc., and its computational intensity could be a million times that of pre-training. Future pre-training will primarily use "synthetic data." I hope 99% of future compute power will be used for inference because inference is the process of converting Tokens into economic benefit. This is why NVIDIA went all-in on inference last year. Inference is "thinking" and "working"; how could it be easy? Inference will only become increasingly difficult.

Jensen Huang: Thank you all for coming to GTC.

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation to acquire or dispose of any financial products, nor should any associated discussions, comments, or posts by the author or other users be considered as such. It is for general information purposes only and does not take into account your own investment objectives, financial situation, or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information; investors should do their own research and may seek professional advice before investing.
