TENCENT's New AI Model Intensifies 3D Capabilities to Bolster Gaming

In the pursuit of Artificial General Intelligence (AGI), AI technology continues to advance towards specialized vertical applications and interactive 3D implementations, with more sophisticated multimodal large models entering the market.

TENCENT, which holds the top position globally in gaming, has consistently focused its efforts on 3D generation, an area in substantial demand for game development. On April 16, TENCENT officially released and open-sourced its HY-World 2.0 3D World Model.

TENCENT's current 3D model series is divided into two categories. While the Hunyuan 3D Generation Model focuses on creating individual high-precision 3D assets, the Hunyuan 3D World Model is dedicated to constructing complete, interactive 3D scenes that can be imported directly into game engines. The 3D World Model is progressively turning the concept of "AI-generated worlds" into reality, although competition in this field is just beginning. On the same day, Alibaba also released its own world model, Happy Oyster, which emphasizes real-time world creation and interaction.

Generate Game Worlds with a Single Click

HY-World 2.0 is a multimodal world model whose core capability lies in understanding various types of input (such as text, images, and videos) and then automatically generating, reconstructing, and simulating 3D worlds. Furthermore, HY-World 2.0 supports the export of multi-format 3D assets (e.g., Mesh, 3DGS, point clouds), enabling seamless integration with existing game development workflows for rapidly generating game maps and level prototypes. In essence, HY-World 2.0 places greater emphasis on practicality by directly producing editable 3D asset files.

Additionally, HY-World 2.0 has achieved a breakthrough in interactivity. The model supports an "Agent Mode," allowing users to control a character to freely explore generated streets, buildings, and environments.

Regarding its technical implementation, the TENCENT Hunyuan team centered its approach on 3D generation. By employing a unified architecture for spatial understanding, generation, and reconstruction, it has achieved state-of-the-art (SOTA) generation results. Reportedly, traditional 3D generation methods often require precise camera parameters to create panoramic views, but obtaining these parameters in practice is extremely challenging. The newly upgraded HY-Pano-2.0 model within HY-World 2.0 adopts an end-to-end implicit learning scheme, enabling the model to autonomously learn the spatial mapping from regular images to 360-degree panoramas, significantly reducing reliance on camera metadata.

After addressing spatial construction, the model also needed to solve the challenge of navigating logically within that space. TENCENT's team developed an in-house spatial Agent technology that combines a Vision-Language Model (VLM) with the navigation mesh (navmesh) representation commonly used in game pathfinding algorithms. This allows the large model not only to understand spatial semantics but also to plan reasonable exploration paths, such as "orbiting an object" or "maximum roaming," ensuring coverage of high-value areas while avoiding issues like walking through walls or moving out of bounds.
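To illustrate why a navmesh-style representation keeps a planned path from crossing walls or leaving the map, here is a minimal sketch. It is not TENCENT's implementation: it assumes a simplified grid in which "." marks a walkable cell and "#" marks a wall, and the function name plan_path and the sample map are hypothetical. A real navmesh uses connected polygons rather than grid cells, but the principle is the same: the planner only ever searches over regions marked as walkable.

```python
from collections import deque


def plan_path(grid, start, goal):
    """Breadth-first search over walkable cells.

    grid  : list of strings, "." = walkable, "#" = wall (assumed encoding)
    start : (row, col) of the starting cell, assumed walkable
    goal  : (row, col) of the target cell
    Returns a shortest list of cells from start to goal, or None.
    """
    rows, cols = len(grid), len(grid[0])
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back through the parent links to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols  # never out of bounds
            if inside and grid[nr][nc] == "." and (nr, nc) not in parents:
                parents[(nr, nc)] = cell
                queue.append((nr, nc))
    return None  # goal is unreachable without crossing a wall


# Hypothetical map: a wall in column 2 forces a detour through the bottom row.
GRID = [
    "..#.",
    "..#.",
    "....",
]
path = plan_path(GRID, (0, 0), (0, 3))
print(path)
```

Because walls and out-of-range cells are never added to the search queue, any path the planner returns stays inside the walkable area by construction. In the system described above, a semantic layer such as the VLM would presumably choose which goals are worth visiting, while this kind of search decides how to reach them.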

Along these planned trajectories, HY-WorldStereo, a newly developed Novel View Synthesis (NVS) model, ensures that newly generated areas connect seamlessly with existing ones both geometrically and visually, maintaining high spatial consistency so that image quality does not degrade during rapid generation.

As early as November 2024, TENCENT had released and open-sourced version 1.0 of its Hunyuan 3D Generation Model. Version 3.0 of that model launched last year, and version 1.0 of the Hunyuan 3D World Model was released in July of last year. Data provided by TENCENT indicates that as of March this year, downloads of the Hunyuan 3D series models in the open-source community had surpassed 3 million. Furthermore, the German software company Maxon has integrated the TENCENT Hunyuan 3D creation engine into its professional 3D software, Cinema 4D.

The Rationale Behind the Push for 3D Generation

Multimodal large models centered around 3D capabilities have been a key focus area for TENCENT in recent years. Information suggests that TENCENT's AI team has clearly invested more effort in developing multimodal models than in large language models over this period.

On the large language model side, TENCENT President Martin Lau said during the March earnings call that, over the past few months, the company had carried out intensive organizational upgrades and workflow restructuring for the Hunyuan large model team in order to strengthen the model's capabilities. It also rebuilt the entire infrastructure for pre-training and reinforcement learning while further improving data quality. It was disclosed at the time that Hunyuan 3.0 was in internal testing and would gradually be opened to the public in early April. Now its "brother" model, HY-World 2.0, has arrived ahead of the Hunyuan 3.0 large language model.

This sends an important signal: even as TENCENT begins to accelerate improvements to its Hunyuan large language model, it will continue to vigorously advance its multimodal models. TENCENT's emphasis on multimodality and 3D worlds is backed by a clear industrial logic: it is all about synergy with its core businesses, particularly supplying ammunition to its most vital profit engine—the gaming division.

Constructing a complex open-world map or detailed level prototype often requires large art teams to spend months or even years. The emergence of 3D large models directly addresses this pain point. Generating a 3D space in seconds from a single sentence or a rough sketch, in a form that can be imported into engines like Unreal Engine, promises significant cost reduction and efficiency gains once fully integrated into internal workflows, potentially revolutionizing game development processes.

It is understood that TENCENT's self-developed, no-code game editor, Light Game Dream Workshop, has already integrated the latest version of the TENCENT Hunyuan 3D generation model. This has created a combined solution featuring "no-code visual programming + a prefab system + a massive resource library + AI generation," forming a user-friendly toolset. Dozens of TENCENT's internal games, including *DreamStar*, have deeply integrated capabilities from the Hunyuan models.

TENCENT management also noted during the March earnings call that the proliferation of productivity-focused AI agents will drive demand for world models such as 3D models. This is because AI technology is destined to complement and ultimately enhance Computer-Aided Design (CAD) capabilities, which are crucial in industrial design and architecture, and whose importance in gaming continues to grow. Management also believes TENCENT holds a uniquely advantageous position in physical AI and 3D modeling. The vast and deep 3D graphics datasets accumulated from its gaming business provide high-quality data for model training, giving the company a solid foundation for offering 3D tools that meet market demand.

However, despite the grand vision of business empowerment painted by 3D generation technology, it currently faces significant challenges. As one of the most difficult areas within multimodal generation, 3D generation places extremely high demands on computational power and data resources. Increases in duration or dimensionality lead to quadratic rises in computational requirements. Complex geometric calculations and physical simulations keep the inference costs high for large-scale applications.

Moreover, in AAA-grade game production scenarios that demand extreme precision, assets generated by AI often still require substantial manual post-generation refinement, remaining some distance from being truly "out-of-the-box" ready. Large model developers must find a balance between substantial capital expenditure on computing power and practical commercial efficiency.


Tiger Brokers
