From Algorithms to Manufacturing Capability: Zhiyuan's Mid-Term Outlook on the Humanoid Robot Sector

Deep News
Yesterday

By 2026, the atmosphere in the embodied intelligence sector is completely different from two years ago. At a recent humanoid robot and embodied intelligence standardization conference in Beijing, Peng Zhihui, Co-founder of Zhiyuan Robot, offered a clear assessment: "There are now over 140 humanoid robot manufacturers in China, with more than 330 product models released. The industry has officially moved from lab demonstrations and showcasing prototypes into the second half—competition centered on engineering implementation and real-world application." This statement effectively redefines the current phase of the entire industry.

In recent years, what often went viral were videos of humanoid robots. Which robot walked more like a human, ran more steadily, or performed flips and acrobatics determined which gained social media traction and funding surges. That was the era of the "showcase state." However, in this speech, Peng repeatedly emphasized a different term: the "deployment state." "From 2024 to early 2025, the competition was about whose robot walked straighter or more naturally. Now, physical agility has reached a practical level. The next phase is about comparing whose robot has stronger working capabilities. We're not just competing domestically but also against leading international firms to see who can truly achieve a functional 'deployment state.'"

When a technology industry shifts focus from "can it move?" to "can it work?", it transitions from a conceptual phase to an engineering phase. For investors, this signals a change in the risk structure. If the previous race was about algorithms and funding, the competition now revolves around systems engineering capability. Peng began his talk candidly: "The entire embodied intelligence industry is still exploring collectively; no single company has all the answers. We need to do the right thing at the right time." While this sounds humble, it implies a key judgment—the technology window has opened, the era of single-point breakthroughs is over, and the era of systems integration has begun.

Why now? Peng's answer was straightforward: "The fundamental reason is the breakthrough brought by AI technology development." He divided the last decade of AI evolution into three stages: perceptual intelligence from deep learning, cognitive intelligence from large models, and the current stage driven by the integration of AI and robotics creating a physically intelligent world. "We have achieved the scaling of digital AI in recent years; now the challenge is scaling physical AI, moving from the digital world to the physical world." This is a longer and more difficult path. In the digital world, "you can restart if code fails"; but in the physical world, "there are physical costs and failure costs." A single fall for a robot could mean hardware damage and cash flow consumption.

Consequently, he proposed an engineering paradigm: "One Body, Three Intelligences." The "One Body" refers to the robot's physical form. "The body is the constrained interface for AI in the real world," Peng emphasized. "The real physical world is full of friction, collisions, deformation, errors, aging, and noise. Body design isn't simple hardware stacking; it's a synthesis of reliability engineering, supply chain engineering, and safety engineering." For capital markets, this reframes the sector's logic from an "algorithm story" back to "manufacturing capability."

He broke down the core components clearly: "Currently, the two most critical components are joints, which determine the upper limit of mobility, and dexterous hands, which determine the upper limit of operational capability. These two components account for the majority of the robot's total cost." This is essentially a cost structure map. In the industry's early days, actuator technologies varied widely, with hydraulic, linear drive, and high-speed, high-stiffness solutions coexisting. But "starting around 2023, solutions began converging towards new types of joints." He even drew an analogy: "Humanoid robot hardware technology is very similar to new energy vehicles, centering on a core 'three-electric system'." The difference lies in complexity. Vehicle motors operate under relatively simple conditions, whereas robots "require high-dynamic, high-frequency forward and reverse rotation, with dozens to over a hundred degrees of freedom throughout the body." Specifications also vary immensely from joint to joint; the torque a finger needs and the torque a thigh needs differ by orders of magnitude. "Designing specifications for each joint individually would be a disaster for mass production," Peng stated frankly.

Zhiyuan's solution is serialization and standardization. "Across five major series and nearly 10 products, we have consolidated the joints into 8 standardized designs. These 8 joints are used across all products, meeting the needs of all body parts. This is the benefit standardization brings." When a company starts talking about "serialized planning" instead of single-product performance, it is preparing for scale.
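
The logic of covering every body part from a small shared catalog can be sketched in a few lines. This is only an illustration under stated assumptions: the eight torque ratings, the body-part requirements, and the names `JointSpec`, `select_joint`, and `assign_joints` are hypothetical and do not reflect Zhiyuan's actual specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JointSpec:
    name: str
    peak_torque_nm: float  # hypothetical rating, in newton-metres


# Hypothetical catalog of 8 standardized joints (values invented for illustration).
CATALOG = [
    JointSpec(f"J{i}", torque)
    for i, torque in enumerate([1, 5, 15, 40, 90, 150, 250, 400], start=1)
]

# Hypothetical per-part torque requirements, from finger to hip.
REQUIREMENTS = {
    "finger": 0.8,
    "wrist": 12.0,
    "elbow": 35.0,
    "shoulder": 80.0,
    "knee": 220.0,
    "hip": 360.0,
}


def select_joint(required_torque_nm, catalog=CATALOG):
    """Return the smallest catalog joint whose rating covers the requirement."""
    candidates = [j for j in catalog if j.peak_torque_nm >= required_torque_nm]
    if not candidates:
        raise ValueError(f"no catalog joint covers {required_torque_nm} Nm")
    return min(candidates, key=lambda j: j.peak_torque_nm)


def assign_joints(requirements, catalog=CATALOG):
    """Map each body part to the name of its selected catalog joint."""
    return {part: select_joint(t, catalog).name for part, t in requirements.items()}


if __name__ == "__main__":
    for part, joint in assign_joints(REQUIREMENTS).items():
        print(f"{part}: {joint}")
```

The trade-off in this sketch is that each part may carry a slightly oversized joint, in exchange for a short parts list that can be produced and qualified in volume.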

The challenge for dexterous hands is more complex. "On one hand, you need to fit 10 to 20 degrees of freedom into a space smaller than a human hand; on the other, it demands extremely high-dimensional perception, especially tactile sensation," he said. Peng offered an intuitive judgment: "Close to 80% of the tasks humans perform well but traditional automation struggles with are strongly related to tactile sensation." An assembly worker judges whether a part has seated correctly by a "click" sound; how to digitize that kind of experience remains an industry bottleneck. "For vision, we first had standard sensors, then standard data formats, then standard datasets, and finally an algorithm explosion," Peng noted. "But for tactile sensors, the technological path hasn't converged, and there are no standards yet." For investors, this implies that once tactile sensing is standardized, it will create a new technological and cost inflection point.

If the "Body" is the physique, the "Three Intelligences" are the soul. Locomotion intelligence has advanced rapidly in recent years. Peng summarized the reasons: "First, the algorithm paradigm shifted from model-driven to reinforcement learning; second, the proliferation of simulation frameworks enabled large-scale parallel training; third, joint technology convergence reduced control difficulty." The combined benefits have significantly enhanced dynamic performance. But locomotion is just the foundation. Peng pointed out that "interaction intelligence provides emotional value, while task intelligence provides productivity value." Interaction intelligence heavily leverages large model achievements, but "future robots must not only understand voice commands; they need to see your emotions, understand your tone, and even anticipate your intent." In his view, emotional value is "more significant than many imagine," which is why robot performances at events like the Spring Festival Gala spark widespread discussion.

What truly determines commercial value is task intelligence. To lower the training barrier, Zhiyuan launched the "Lingchuang Platform." "We've simplified the action training process to be like posting a TikTok video—upload a clip, and the platform automatically handles key point detection, motion transfer, training, and deployment," Peng explained. This signifies the industry's shift from a "development state" for researchers to a "creation state" involving the public, ultimately achieving a low-cost "deployment state."

Deployment was the core theme throughout the speech. Discussing scenario selection, he proposed a "pick the low-hanging fruit" strategy. "We categorize tasks by scene complexity and task complexity. Scene complexity is a constraint that doesn't reflect value; task complexity is what demonstrates value." Autonomous driving involves simple tasks in complex environments, whereas humanoid robots are currently better suited for "complex tasks in simple environments," such as high-degree-of-freedom operations in structured factory settings. "Ultimately, both autonomous driving and embodied intelligence will evolve towards performing complex tasks in complex environments," Peng said. "But for now, we must choose a realistically feasible path." This reflects phased pragmatism rather than end-game idealism.
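
The two-axis categorization described above can be pictured as a simple quadrant lookup. The sketch below is an illustration only: the 0-to-1 complexity scores, the 0.5 threshold, and the function name `classify` are assumptions made for the example, not figures from the speech.

```python
def classify(scene_complexity, task_complexity, threshold=0.5):
    """Place a use case in one of four quadrants.

    Both scores are hypothetical values in [0, 1]; anything at or above
    the threshold counts as "complex".
    """
    scene = "complex" if scene_complexity >= threshold else "simple"
    task = "complex" if task_complexity >= threshold else "simple"
    return f"{task} task, {scene} environment"


# Hypothetical scores chosen to mirror the two examples in the text.
EXAMPLES = {
    "autonomous driving": (0.9, 0.3),    # open roads, narrow task
    "factory manipulation": (0.2, 0.8),  # structured site, dexterous task
}

if __name__ == "__main__":
    for name, (scene, task) in EXAMPLES.items():
        print(f"{name}: {classify(scene, task)}")
```

In this framing, the "low-hanging fruit" is the quadrant with high task complexity and low scene complexity, where the value is high and the environmental constraint is mild.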

In the latter part of his talk, Peng used a vivid analogy to clarify the logic behind the humanoid form factor. "Computer Use is the humanoid interface for the digital world, while humanoid robots are the universal interface for the physical world." Theoretically, having AI generate underlying code might be more efficient, but the real-world software ecosystem is designed for mouse and keyboard interaction, making interface operation the most universal path. Similarly, real-world elements like door handle height, stair dimensions, and tool shapes are designed for the human body. "Since the environment is built around humans, for AI to achieve maximum universality and compatibility, the terminal form will most likely need to resemble a human," Peng stated. "It may not be the most efficient form, but it offers the strongest compatibility." This statement addresses a common question in capital markets: Why pursue a humanoid shape? Because it's an interface.

Finally, he characterized the industry's mid-term phase as being about "infrastructure, not single-point products." "The key to scaling physical AI lies in data closed loops, reliability engineering, and the standardization of operational capabilities," Peng said. "We need to move fast, but also steadily." When a robot company starts emphasizing data governance, evaluation systems, operational experience, and co-building standards, its perspective shifts from product launches to building an industrial system.

For investors, 2026 might not be the breakout year, but rather the year of differentiation. The showcase state ends; the deployment state begins. The premium for hype recedes, while pricing based on engineering capability rises. Metrics like joint yield rates, serialization planning, real-world operational hours, and data feedback efficiency will gradually replace stage performance difficulty as the core valuation drivers. Moving from digital AI to physical AI represents a more prolonged industrial migration. Peng concluded, "Standardization is not just a technical specification; it is an accelerator for industrial implementation." When an industry starts talking about standards, it signals preparation for scale. And scale is never achieved by proclamation; it is built slowly, joint by joint, hand by hand, through supply chains and operational systems.

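Disclaimer: Investing involves risk. This article is not investment advice, and the above content should not be regarded as an offer, recommendation, or solicitation to buy or sell any financial product; nor should any related discussions, comments, or posts by the author or other users be regarded as such. This article is for general reference only and does not take into account your personal investment objectives, financial situation, or needs. TTM makes no representation or warranty as to the accuracy or completeness of the information; investors should do their own research and seek professional advice before investing.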
