Second Globally, First in China! DingTalk AI Achieves Major Breakthrough, Outperforming OpenAI and Claude on the DeepResearch Bench Benchmark

DingTalk's AI research system, Dingtalk-DeepResearch, scored 48.49 points on DeepResearch Bench, an authoritative international evaluation, ranking second globally and first in China and surpassing mainstream systems such as OpenAI and Claude. The system has already been applied in complex scenarios such as manufacturing and supply chains, where it shows industry-leading accuracy and robustness on heterogeneous tables, multi-stage reasoning, and multimodal generation tasks. The result marks a dual breakthrough, in international benchmarks and in real-world production use, and places Chinese enterprise AI technology in the global first tier.

At the core of Dingtalk-DeepResearch is a multi-agent deep research framework designed for real enterprise scenarios. It integrates deep research generation, heterogeneous table parsing and reasoning, and multimodal report generation into a single system. The design mimics a team of specialists working together: some agents analyze tabular data, some generate reports, and others coordinate tool usage. A three-layer architecture (a task-oriented agent layer, a core engine layer, and a data layer) lets the system process complex tasks in parallel and reason over them in multiple stages. For example, it can automatically parse factory production tables that contain nested and merged cells and turn them into structured, insightful analysis reports.
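To make the merged-cell example concrete, here is a minimal Python sketch of one common normalization step: filling vertically merged cells so each row can stand on its own. It is an illustration only, not DingTalk's parser. The function names, the convention that a merged cell arrives as `None`, and the sample production table are all assumptions made for this example.

```python
from typing import Optional

Row = list[Optional[str]]


def fill_merged_cells(rows: list[Row]) -> list[list[str]]:
    """Fill vertically merged cells: a None cell inherits the value above it.

    Assumes a rectangular table in which a merged region is represented
    by its value in the top row and None in the rows below (hypothetical
    convention chosen for this sketch).
    """
    filled: list[list[str]] = []
    previous: Optional[list[str]] = None
    for row in rows:
        if previous is None:
            # First row has nothing above it; treat missing cells as empty.
            current = [cell if cell is not None else "" for cell in row]
        else:
            current = [
                cell if cell is not None else above
                for cell, above in zip(row, previous)
            ]
        filled.append(current)
        previous = current
    return filled


def to_records(rows: list[Row]) -> list[dict[str, str]]:
    """Use the first row as the header and return one dict per data row."""
    header, *body = fill_merged_cells(rows)
    return [dict(zip(header, data_row)) for data_row in body]


# Hypothetical factory table: "Line" is merged across the first two data rows.
raw_table: list[Row] = [
    ["Line", "Machine", "Output"],
    ["A", "M1", "120"],
    [None, "M2", "95"],
    ["B", "M3", "110"],
]
records = to_records(raw_table)
```

After this step every record carries its own `Line` value, so downstream reasoning or report generation no longer depends on the table's visual layout.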

To adapt to dynamic enterprise environments, the framework features an entropy-guided, memory-aware online learning mechanism that lets it keep evolving without human intervention. The system learns from historical interactions and gradually adapts to different business processes and user preferences. For instance, when users repeatedly change the format of AI-generated reports, the system learns those preferences on its own and applies them to subsequent outputs. These personalized preferences can be shared across teams or entire organizations, improving knowledge reuse and efficiency.
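The article does not say how entropy guides the learning mechanism. Purely as an illustration of the general idea, the Python sketch below stores a user's observed format choices and only treats one as a learned preference once the Shannon entropy of those choices is low, meaning the user has been consistent. The class name, the thresholds, and the format labels are assumptions for this example and do not describe DingTalk's implementation.

```python
import math
from collections import Counter
from typing import Optional


def shannon_entropy(counts: Counter) -> float:
    """Shannon entropy, in bits, of the empirical distribution in `counts`."""
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log2(n / total) for n in counts.values() if n
    )


class PreferenceMemory:
    """Remembers a user's format edits (hypothetical helper for this sketch)."""

    def __init__(self, max_entropy_bits: float = 0.8, min_observations: int = 3):
        # Both thresholds are illustrative values, not taken from the article.
        self.max_entropy_bits = max_entropy_bits
        self.min_observations = min_observations
        self.observed: Counter = Counter()

    def record(self, chosen_format: str) -> None:
        """Store one observed user choice."""
        self.observed[chosen_format] += 1

    def learned_preference(self) -> Optional[str]:
        """Return the dominant format, or None while the signal is too noisy."""
        if sum(self.observed.values()) < self.min_observations:
            return None
        if shannon_entropy(self.observed) > self.max_entropy_bits:
            return None
        return self.observed.most_common(1)[0][0]


consistent_user = PreferenceMemory()
for choice in ["bullet summary"] * 4 + ["full table"]:
    consistent_user.record(choice)

undecided_user = PreferenceMemory()
for choice in ["bullet summary", "full table"] * 2:
    undecided_user.record(choice)
```

Here `consistent_user.learned_preference()` yields `"bullet summary"` because four of five choices agree, while `undecided_user.learned_preference()` yields `None` because an even split has the maximum entropy of one bit.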

To ensure output quality, Dingtalk-DeepResearch incorporates the DingAutoEvaluator assessment system, which runs multi-dimensional "quality checks" on generated reports, covering data accuracy, logical coherence, and compliance with tool usage standards. When it detects issues, the system automatically feeds them back into the training process for model optimization, forming a closed improvement loop.
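The closed loop described above can be sketched roughly as follows. This is not DingAutoEvaluator's actual interface, which the article does not describe. The report fields, the approved tool names, and the feedback queue are all hypothetical, and only two of the three dimensions (data accuracy and tool usage) are checked, since logical coherence does not reduce to a simple rule.

```python
from dataclasses import dataclass

# Hypothetical list of approved tools for this sketch.
ALLOWED_TOOLS = {"table_parser", "chart_renderer"}


@dataclass
class Finding:
    dimension: str
    message: str


def check_data_accuracy(report: dict) -> list[Finding]:
    """Flag a report whose stated total disagrees with its own row values."""
    expected = sum(report["row_values"])
    if report["stated_total"] != expected:
        message = f"stated total {report['stated_total']} != computed {expected}"
        return [Finding("data_accuracy", message)]
    return []


def check_tool_usage(report: dict) -> list[Finding]:
    """Flag any tool call that is not on the approved list."""
    return [
        Finding("tool_usage", f"unapproved tool: {name}")
        for name in report["tool_calls"]
        if name not in ALLOWED_TOOLS
    ]


def evaluate(report: dict, feedback_queue: list[dict]) -> list[Finding]:
    """Run every check; queue any findings as feedback for later training."""
    findings = check_data_accuracy(report) + check_tool_usage(report)
    if findings:
        feedback_queue.append({"report_id": report["id"], "findings": findings})
    return findings


feedback_queue: list[dict] = []
sample_report = {
    "id": "r-001",
    "row_values": [120, 95, 110],
    "stated_total": 320,  # deliberately wrong: the rows sum to 325
    "tool_calls": ["table_parser", "web_scraper"],
}
findings = evaluate(sample_report, feedback_queue)
```

In this sketch the sample report produces one finding per dimension, and both land in `feedback_queue`, standing in for the step where detected issues flow back into model optimization.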

Dingtalk-DeepResearch is now running stably in real-world business scenarios. In supply chain management, it rapidly analyzes complex tabular data from multiple departments to provide intelligent procurement recommendations. In manufacturing, it converts raw equipment data into visual analysis reports that support predictive maintenance and decision-making. All core functions have been validated through international benchmark tests, supporting the system's reliability and technological leadership.

DingTalk's CTO Zhu Hong stated, "Dingtalk-DeepResearch combines adaptive optimization and multimodal reasoning to create a flexible, enterprise-grade AI framework capable of handling complex and evolving real-world tasks. This technology is accelerating deployment in AI search, AI tables, automated workflows, and Agent platforms, bringing cutting-edge AI closer to practical production needs and delivering tangible value to enterprises."


Tiger Brokers
