Microsoft (MSFT.US) and NVIDIA (NVDA.US) Set New AI Inference Record at 1.1 Million Tokens per Second

Stock News
11/04

Microsoft (MSFT.US) announced that its Azure ND GB300 v6 virtual machines set a new industry record of 1.1 million tokens per second for inference on Meta's Llama 2 70B model. The Azure ND GB300 v6 is built on NVIDIA's (NVDA.US) Blackwell Ultra GPUs, specifically the NVIDIA GB300 NVL72 system, which combines 72 NVIDIA Blackwell Ultra GPUs and 36 NVIDIA Grace CPUs in a single rack-scale system. Optimized for inference workloads, the virtual machine offers 50% more GPU memory and a 16% higher thermal design power (TDP) than the previous generation.

Microsoft CEO Satya Nadella stated on social media, "This milestone reflects our long-standing collaboration with NVIDIA and expertise in deploying AI at production scale." To validate the performance gains, Microsoft ran the Llama 2 70B model at FP4 precision across 18 ND GB300 v6 virtual machines within a single NVIDIA GB300 NVL72 domain, using NVIDIA TensorRT-LLM as the inference engine. The company confirmed that one NVL72 rack of Azure ND GB300 v6 reached an aggregate inference throughput of 1.1 million tokens per second, surpassing its previous record of 865,000 tokens per second on the NVIDIA GB200 NVL72 platform.

Russ Fellows, Vice President of Labs at Signal65, noted, "This breakthrough not only crosses the million-tokens-per-second threshold but also does so on a platform capable of meeting modern enterprises' dynamic usage and data governance needs." He added that the Azure ND GB300 offers a 27% improvement in inference performance over the previous-generation NVIDIA GB200 while requiring only a 17% increase in power specifications.
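
The throughput figures reported above can be cross-checked with simple arithmetic. The short Python sketch below is illustrative only: it uses the rounded numbers quoted in this article (1.1 million and 865,000 tokens per second, 72 GPUs, 18 virtual machines), and the variable and function names are hypothetical, not part of any Microsoft or NVIDIA tooling.

```python
# Figures as quoted in the article; the source rounds them, so results are approximate.
GB300_TOKENS_PER_SEC = 1_100_000  # one NVL72 rack of Azure ND GB300 v6
GB200_TOKENS_PER_SEC = 865_000    # previous record on GB200 NVL72
GPUS_PER_RACK = 72                # Blackwell Ultra GPUs per NVL72 rack
VMS_PER_RACK = 18                 # ND GB300 v6 virtual machines used in the test


def percent_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


gain = percent_gain(GB300_TOKENS_PER_SEC, GB200_TOKENS_PER_SEC)
per_gpu = GB300_TOKENS_PER_SEC / GPUS_PER_RACK
gpus_per_vm = GPUS_PER_RACK // VMS_PER_RACK

print(f"Throughput gain over GB200: {gain:.1f}%")
print(f"Approximate tokens/s per GPU: {per_gpu:,.0f}")
print(f"GPUs per virtual machine: {gpus_per_vm}")
```

With these rounded inputs, the rack-level gain works out to roughly 27%, in line with the figure Fellows cites, and the aggregate throughput corresponds to about 15,000 tokens per second per GPU, with four GPUs per virtual machine.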

