A significant barrier in the field of embodied intelligence is gradually being dismantled. On February 10, Alibaba's DAMO Academy officially released RynnBrain, a foundation model for the embodied intelligent brain, and open-sourced the entire series of seven models at once, including the industry's first embodied model built on a 30B Mixture-of-Experts (MoE) architecture.
This move carries considerable milestone significance. According to reports, RynnBrain is the first model to endow robots with spatiotemporal memory and spatial reasoning capabilities, and it set new state-of-the-art records on 16 open-source embodied intelligence evaluation benchmarks, surpassing top-tier industry models such as Google's Gemini Robotics ER 1.5.
This indicates that the longstanding constraints of "spatiotemporal amnesia" and "physical hallucinations" in embodied intelligence are being actively addressed. The robotic brain is expected to evolve from a simple command receiver into an intelligent agent capable of deep environmental understanding.
For a long time, the intelligence level of embodied models has been a major bottleneck hindering the generalization of robots. The shortcoming in generalization ability, in particular, has greatly limited their application in complex physical scenarios.
To break through this bottleneck, the industry has explored multiple technical paths. One path focuses on Vision-Language-Action (VLA) models, which can directly manipulate the physical world; however, because high-quality robotic data is scarce, achieving cross-scenario generalization is extremely difficult. Another path introduces brain models such as Vision-Language Models (VLMs), which have generalization potential, yet these models generally lack memory, have limited dynamic cognition, and often suffer from physical hallucinations, making it hard for them to support complex mobile manipulation by humanoid robots.
This technological wall, rooted in deficiencies of the intelligence architecture, means that even seemingly advanced robots struggle with complex mobile manipulation tasks.
Alibaba DAMO Academy's RynnBrain model was born precisely to tear down this wall from its foundational logic. It is reported that RynnBrain innovatively introduces two core capabilities: spatiotemporal memory and physical world reasoning. These are two fundamental abilities required for deep interaction between robots and their environment.
Spatiotemporal memory refers to the robot's ability to locate objects within its complete historical memory, backtrack to target areas, and even predict motion trajectories, giving it a global capability for spatiotemporal recall and backtracking.
Physical spatial reasoning differs from traditional pure-text reasoning paradigms: RynnBrain adopts an interleaved reasoning strategy that combines text with spatial localization, keeping its reasoning process firmly grounded in the physical environment and significantly reducing hallucinations.
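To make the idea of interleaved text-and-spatial reasoning more concrete, here is a minimal Python sketch of what a grounded reasoning trace could look like. The step contents, field names, and the (x1, y1, x2, y2) pixel-box format are illustrative assumptions, not RynnBrain's actual output schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Illustrative sketch of an interleaved text + spatial-grounding reasoning
# trace. Field names, step contents, and the (x1, y1, x2, y2) pixel-box
# format are assumptions, not RynnBrain's actual output schema.

@dataclass
class ReasoningStep:
    text: str                                  # natural-language reasoning fragment
    box: Optional[Tuple[int, int, int, int]]   # spatial grounding in the current frame, if any

trace: List[ReasoningStep] = [
    ReasoningStep("The mug the user mentioned is on the left counter.", box=(112, 340, 198, 452)),
    ReasoningStep("The counter edge is reachable, so approach it first.", box=(80, 300, 640, 480)),
    ReasoningStep("Grasp from the handle side to avoid occluding the camera.", box=(150, 360, 185, 410)),
]

def grounding_ratio(steps: List[ReasoningStep]) -> float:
    """Fraction of reasoning steps tied to spatial evidence -- the basic
    intuition behind reducing 'physical hallucinations'."""
    grounded = sum(step.box is not None for step in steps)
    return grounded / max(len(steps), 1)

print(f"{grounding_ratio(trace):.0%} of steps are spatially grounded")  # 100%
```

The design intuition is simply that each textual claim about the scene should be checkable against a region of the observation, which makes ungrounded statements easier to detect.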
For example, if a robot running RynnBrain is interrupted while performing Task A and asked to carry out Task B first, it can accurately remember the temporal and spatial state of Task A and seamlessly resume it after finishing Task B. This brain-level memory mechanism resolves the long-standing problem of "instantaneous amnesia" in embodied intelligence.
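As a rough illustration of the bookkeeping such a memory enables, the sketch below suspends an interrupted task together with its last known step, time, and location, and later resumes it. The class and field names are hypothetical and are not part of any RynnBrain API.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical sketch of the bookkeeping a spatiotemporal memory enables:
# suspend Task A with its last known step, time, and location, run Task B,
# then resume Task A exactly where it left off. Names are illustrative.

@dataclass
class TaskState:
    name: str
    step: int                             # next sub-step to execute
    location: Tuple[float, float, float]  # (x, y, z) of the relevant work area
    timestamp: float = field(default_factory=time.time)

class SpatiotemporalMemory:
    def __init__(self) -> None:
        self._suspended: List[TaskState] = []

    def suspend(self, state: TaskState) -> None:
        """Remember where and when a task was interrupted."""
        self._suspended.append(state)

    def resume(self) -> Optional[TaskState]:
        """Return the most recently interrupted task, if any."""
        return self._suspended.pop() if self._suspended else None

memory = SpatiotemporalMemory()
memory.suspend(TaskState("tidy_desk", step=3, location=(1.2, 0.4, 0.8)))  # Task A interrupted
# ... Task B would run here ...
resumed = memory.resume()
print(resumed.name, resumed.step)  # tidy_desk 3 -> continue from sub-step 3
```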
Furthermore, RynnBrain was trained on the basis of Qwen3-VL and deeply optimized with DAMO Academy's self-developed RynnScale architecture, doubling training speed under equivalent computing resources on a training dataset exceeding 20 million pairs.
This efficient training system is directly reflected in the evaluation results. RynnBrain comprehensively set new industry records across 16 key tasks, including environmental perception, object reasoning, egocentric visual question answering, spatial reasoning, and trajectory prediction. This achievement is not merely about computational power but represents a successful reconstruction of the underlying architecture for embodied intelligence.
RynnBrain also offers strong extensibility, enabling rapid post-training of various embodied models for navigation, planning, and action, and positioning it to become a foundation model for the embodied intelligence industry.
On the path to building a foundation model for the embodied intelligence industry, DAMO Academy has chosen the open-source route. The academy open-sourced the entire RynnBrain series, totaling seven models, including foundation models across a full range of sizes and post-trained specialized models. Among them is the industry's first 30B embodied model with an MoE architecture, which can surpass the performance of 72B industry models while activating only 3B parameters at inference, enabling faster and smoother robot movements.
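The efficiency claim follows from how MoE layers work: a router activates only a small subset of experts for each token, so most of the model's total parameters sit idle on any given forward pass. The toy sketch below uses made-up sizes (8 experts, top-2 routing) purely to illustrate that routing logic; it does not reflect RynnBrain's actual configuration.

```python
import numpy as np

# Minimal sketch of why an MoE model can hold many parameters but activate
# only a fraction per token: a router picks a small top-k subset of experts,
# so the remaining expert weights are untouched for that token.
# Sizes below are toy values, not RynnBrain's configuration.

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2

router_w = rng.normal(size=(d_model, n_experts))
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:
    """x: (d_model,) token representation -> (d_model,) output."""
    logits = x @ router_w                        # routing score for each expert
    chosen = np.argsort(logits)[-top_k:]         # indices of the top-k experts
    weights = np.exp(logits[chosen]) / np.exp(logits[chosen]).sum()
    # Only the chosen experts participate in the computation for this token:
    return sum(w * (x @ experts[i]) for w, i in zip(weights, chosen))

token = rng.normal(size=d_model)
out = moe_layer(token)
print(out.shape, f"active expert params ≈ {top_k / n_experts:.0%} of the total")
```

In a real MoE model the router is trained jointly with the experts and load-balancing terms keep expert usage even; the point here is only that per-token compute scales with the number of activated experts rather than the total parameter count.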
Concurrently, DAMO Academy also open-sourced a new evaluation benchmark, RynnBrain-Bench, designed to evaluate fine-grained spatiotemporal embodied tasks, filling a gap in the industry.
Behind this large-scale open-sourcing by Alibaba DAMO Academy lies a broader industry ambition: to accelerate the construction of an open and evolvable embodied intelligence ecosystem.
From a global technology competition perspective, embodied intelligence is at a critical inflection point, transitioning from the "digital virtual" to the "physical entity."
Zhao Deli, head of the Embodied Intelligence Lab at DAMO Academy, pointed out that RynnBrain has, for the first time, achieved a brain's deep understanding and reliable planning of the physical world, marking a key step towards general embodied intelligence under a hierarchical cerebrum-cerebellum architecture. The expectation is that it will accelerate the process of AI moving from the digital world into real-world physical scenarios.
In 2017, on the 18th anniversary of Alibaba's founding, Jack Ma established DAMO Academy, dedicating it to the technology and R&D problems that drive productivity. At the time, Alibaba committed to investing 100 billion yuan in DAMO Academy over three years.
However, over the past three years, against the backdrop of major organizational changes within the Alibaba Group, DAMO Academy has also undergone repeated adjustments and reshuffles. Its previously diverse "4+X" research areas have now been streamlined to primarily "Intelligence + Computing." The intelligence direction includes medical AI, decision intelligence, video technology, embodied intelligence, and genomic intelligence, while the computing direction encompasses computing technology and RISC-V, among others.
Embodied intelligence is clearly one of DAMO Academy's key investment areas today. It is understood that the academy is building deployable, scalable, and evolvable embodied intelligence systems, and it has already open-sourced WorldVLA, a model that integrates world models with VLA models; the world understanding model RynnEC; and the industry's first robot context protocol, RynnRCP.
As DAMO Academy deepens its focus on embodied intelligence, the global humanoid robot market is also reaching a critical juncture: 2025 is regarded as the starting point of its large-scale development.
According to IDC data, the global shipment of humanoid robots reached nearly 18,000 units last year, a year-on-year increase of approximately 508%, with sales amounting to about $440 million. During the same period, cumulative sales orders are estimated to have exceeded 35,000 units.
Although this field still faces challenges such as scarce real-world physical feedback data, generalization in unstructured environments, and deep hardware-software integration, the open-sourcing of RynnBrain undoubtedly provides global developers with a relatively mature "brain template." This is conducive to accelerating the industrialization of embodied intelligence.
For the industry, this is not just a release of code but a redistribution of technological capability. When top-tier models are no longer secret weapons confined to the labs of giants, the embodied intelligence industry will enter a new cycle of accelerated iteration and collective evolution.