Global Top 2, Domestic No.1! DingTalk AI Achieves Major Breakthrough, Outperforms OpenAI and Claude in DeepResearch Benchmark

DingTalk's AI research system "Dingtalk-DeepResearch" scored 48.49 points on the DeepResearch Bench evaluation, ranking second globally and first in China, ahead of mainstream systems such as OpenAI and Claude. The system has also been deployed in complex scenarios such as manufacturing and supply chains, where it is reported to deliver industry-leading accuracy and robustness on heterogeneous tables, multi-stage reasoning, and multimodal generation tasks. DingTalk presents the result as a dual breakthrough, in international benchmarks and in real-world production use, that places Chinese enterprise AI technology in the global first tier.

The core of Dingtalk-DeepResearch is a multi-agent deep research framework designed for real enterprise scenarios. It integrates deep research generation, heterogeneous table parsing and reasoning, and multimodal report generation into a single system. The design mimics a team of specialists working together: some agents analyze tabular data, others generate reports, and others coordinate tool usage. Through a three-layer architecture (a task-oriented agent layer, a core engine layer, and a data layer), the system supports parallel processing and multi-stage reasoning for complex tasks. For example, it can automatically parse factory production tables with nested and merged cells and turn them into structured analysis reports.
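To make the merged-cell example concrete, here is a minimal Python sketch of one common preprocessing step: copying the value of a merged cell into every position it covers, so that each row can be read as a standalone record. The article does not describe how Dingtalk-DeepResearch actually parses tables, so the function name `expand_merged_cells`, the range format, and the sample data are all illustrative assumptions.

```python
from typing import List, Optional, Tuple

Grid = List[List[Optional[str]]]
# Assumed range format: (first_row, first_col, last_row, last_col), inclusive.
MergedRange = Tuple[int, int, int, int]


def expand_merged_cells(grid: Grid, merged: List[MergedRange]) -> Grid:
    """Return a copy of grid in which each merged range repeats its top-left value."""
    result = [row[:] for row in grid]
    for r0, c0, r1, c1 in merged:
        value = grid[r0][c0]
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                result[r][c] = value
    return result


# Hypothetical production table: "Line A" is merged across two rows.
raw = [
    ["Line", "Shift", "Units"],
    ["Line A", "Day", "120"],
    [None, "Night", "95"],
    ["Line B", "Day", "110"],
]
flat = expand_merged_cells(raw, [(1, 0, 2, 0)])

# After expansion, every data row maps cleanly onto the header.
records = [dict(zip(flat[0], row)) for row in flat[1:]]
```

Once every row carries its own "Line" value, downstream analysis no longer depends on the visual layout of the original sheet.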

To adapt to dynamic enterprise environments, the framework features an entropy-guided, memory-aware online learning mechanism, enabling continuous evolution without human intervention. This allows the system to learn from historical interactions and gradually adapt to different business processes and user preferences. For instance, when users repeatedly modify AI-generated report formats, the system autonomously learns and memorizes their preferences for subsequent outputs. These personalized preferences can be shared across teams or entire organizations, enhancing knowledge reuse and efficiency.
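The article does not explain how entropy guides the learning mechanism. One plausible reading, sketched below purely as an assumption, is that the system commits a preference to memory only once a user's edits are consistent, that is, once the Shannon entropy of their observed choices is low. The class `FormatPreferenceMemory` and its thresholds are hypothetical and are not part of any published DingTalk API.

```python
import math
from collections import Counter
from typing import Optional


def shannon_entropy(counts: Counter) -> float:
    """Shannon entropy, in bits, of the empirical distribution in counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values() if n > 0)


class FormatPreferenceMemory:
    """Hypothetical memory that stores a preference only when edits are consistent."""

    def __init__(self, min_observations: int = 3, max_entropy: float = 0.9):
        self.counts: Counter = Counter()
        self.min_observations = min_observations  # assumed threshold
        self.max_entropy = max_entropy            # assumed threshold, in bits

    def record_edit(self, chosen_format: str) -> None:
        """Record the format the user changed a generated report to."""
        self.counts[chosen_format] += 1

    def learned_preference(self) -> Optional[str]:
        """Return the dominant format, or None if the evidence is too thin or too mixed."""
        if sum(self.counts.values()) < self.min_observations:
            return None
        if shannon_entropy(self.counts) > self.max_entropy:
            return None
        return self.counts.most_common(1)[0][0]


memory = FormatPreferenceMemory()
for fmt in ["table", "table", "bullets", "table"]:
    memory.record_edit(fmt)
preferred = memory.learned_preference()
```

In this sketch a user who switches evenly between two formats produces an entropy of 1 bit, above the assumed threshold, so no preference is stored until their choices settle.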

To ensure output quality, Dingtalk-DeepResearch incorporates the DingAutoEvaluator assessment system, which conducts multi-dimensional "quality checks" on generated reports, covering data accuracy, logical coherence, and tool usage standards. If issues are detected, the system automatically feeds them back into the training process for model optimization, forming a closed-loop improvement cycle.
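The closed loop described above can be illustrated with a small Python sketch: a report is run through one check per quality dimension, and any failures are queued as feedback for later optimization. The internals of DingAutoEvaluator are not public in this article, so the check functions, the report fields, and the feedback queue below are simplified stand-ins chosen for illustration only.

```python
from typing import Any, Callable, Dict, List

Report = Dict[str, Any]

# Assumed allow-list of tools; the names are hypothetical.
ALLOWED_TOOLS = {"table_parser", "chart_renderer"}


def totals_match(report: Report) -> bool:
    """Data accuracy stand-in: the stated total must equal the sum of the rows."""
    return sum(report["rows"]) == report["stated_total"]


def has_conclusion(report: Report) -> bool:
    """Logical coherence stand-in: the report must contain a non-empty conclusion."""
    return bool(str(report.get("conclusion", "")).strip())


def tools_allowed(report: Report) -> bool:
    """Tool usage stand-in: only allow-listed tools may have been called."""
    return set(report["tools_used"]) <= ALLOWED_TOOLS


CHECKS: Dict[str, Callable[[Report], bool]] = {
    "data_accuracy": totals_match,
    "logical_coherence": has_conclusion,
    "tool_usage": tools_allowed,
}


def evaluate(report: Report, feedback_queue: List[dict]) -> Dict[str, bool]:
    """Run every check and queue failed dimensions as feedback for later training."""
    results = {name: check(report) for name, check in CHECKS.items()}
    failed = [name for name, ok in results.items() if not ok]
    if failed:
        feedback_queue.append({"report_id": report["id"], "failed": failed})
    return results


feedback: List[dict] = []
draft = {
    "id": "r-001",
    "rows": [120, 95, 110],
    "stated_total": 330,  # wrong on purpose: the rows sum to 325
    "conclusion": "Night shift output trails day shift.",
    "tools_used": ["table_parser"],
}
scores = evaluate(draft, feedback)
```

Here the deliberately wrong total fails the data accuracy check, so the report lands in `feedback` while the other two dimensions pass.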

Dingtalk-DeepResearch is already running in production across real business scenarios. In supply chain management, it analyzes complex cross-departmental tabular data to produce procurement recommendations. In manufacturing, it converts raw equipment data into visual analysis reports that support predictive maintenance and decision-making. According to DingTalk, all core capabilities have been validated on international benchmarks.

DingTalk's CTO Zhu Hong stated, "Dingtalk-DeepResearch combines adaptive optimization and multimodal reasoning to create a flexible, enterprise-grade AI framework capable of handling complex and evolving real-world tasks. This technology is accelerating deployment in AI search, AI tables, automated workflows, and Agent platforms, bringing cutting-edge AI closer to practical production needs and delivering tangible value to enterprises."

