Google Unveils Major Open-Source Release: Deep Research Agent Achieves SOTA, Costs 90% Less Than GPT-5 Pro

Deep News
2025/12/12

In the early hours of December 12, Google announced three major advancements in Agent technology, getting ahead of OpenAI by about an hour. The updates are: an enhanced Deep Research Agent, now available to developers; the open-sourcing of DeepSearchQA, a new benchmark for evaluating web research tasks; and the launch of the Interactions API.

The **Gemini Deep Research Agent**, optimized for long-context data collection and synthesis, is powered by the **Gemini 3 Pro** model. It employs multi-step reinforcement learning to navigate complex information environments with high precision. Key updates include targeted web searches for specific data and lower-cost report generation.

Lukas Haas, Product Manager at Google DeepMind, revealed on X that the new Gemini Deep Research Agent has achieved state-of-the-art (SOTA) performance, scoring 46.4% on Google’s benchmark tests. It matches **GPT-5 Pro** on BrowseComp while costing only about 10% as much.

The **Deep Research Agent** will soon be integrated into Google Search, NotebookLM, and Google Finance, with upgrades also coming to the Gemini app.

Meanwhile, **DeepSearchQA**, a new benchmark, features 900 manually designed "causal chain" tasks spanning 17 domains to evaluate Agents in complex, multi-step query scenarios. Unlike traditional fact-based tests, it emphasizes comprehensiveness, requiring Agents to generate detailed answer sets while tracking retrieval accuracy.
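
To make the "answer set" idea concrete, here is a minimal sketch of how a set-valued answer could be scored for both comprehensiveness (recall) and accuracy (precision). The article does not describe DeepSearchQA's actual scoring method, so the function name `score_answer_set`, the exact-match comparison, and the sample data are all assumptions made for illustration.

```python
def score_answer_set(predicted, expected):
    """Compare a predicted answer set against the expected one.

    Returns (precision, recall, f1). Matching is case-insensitive exact
    string comparison, an illustrative simplification and not the
    benchmark's real grading logic.
    """
    pred = {p.strip().lower() for p in predicted}
    gold = {g.strip().lower() for g in expected}
    hits = len(pred & gold)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if hits else 0.0
    return precision, recall, f1


# Hypothetical task: the Agent finds two of three expected items plus one wrong one.
precision, recall, f1 = score_answer_set(
    predicted=["Alpha", "beta", "delta"],
    expected=["alpha", "beta", "gamma"],
)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

A single-fact test would only check whether one answer is right; a set-based score like this also penalizes an Agent that stops searching early (low recall) or pads its answer with unsupported items (low precision).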

The **Interactions API** serves as a unified interface for engaging with Gemini models and Agents. Available in beta via Google AI Studio, it works with the Agent Development Kit (ADK) and the Agent2Agent (A2A) protocol. Developers praised the move, likening it to "handing them a digital Sherlock Holmes" for deep research automation.

### Key Features

1. **Deep Research Agent**
   - Enhanced web search capabilities for precise data extraction.
   - Lower-cost, high-quality report generation.
   - Applications in finance, biotech, and market research.
   - Supports structured outputs (JSON), detailed citations, and user-controlled formatting.

2. **DeepSearchQA Benchmark**
   - Measures Agent performance in multi-step reasoning.
   - Internal tests show improved results with extended search and inference steps.

3. **Interactions API**
   - Server-side state management reduces client-side complexity.
   - Supports background execution and remote Model Context Protocol (MCP) tool integration.
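
As a rough illustration of why server-side state management simplifies clients, the sketch below uses an in-memory stand-in for a service that keeps the conversation history itself. It is not the Interactions API: the class `ToyInteractionServer`, its `create` method, and the `previous_id` parameter are hypothetical names chosen for this example, and background execution and MCP tools are not modeled.

```python
import itertools


class ToyInteractionServer:
    """In-memory stand-in for a service that stores conversation state.

    Hypothetical: this only demonstrates the general pattern and does
    not reproduce any real Google API.
    """

    def __init__(self):
        self._ids = itertools.count(1)
        self._history = {}  # interaction id -> list of all turns so far

    def create(self, user_input, previous_id=None):
        # The server rebuilds context from its own stored history.
        turns = list(self._history.get(previous_id, []))
        turns.append(user_input)
        new_id = f"int-{next(self._ids)}"
        self._history[new_id] = turns
        return {"id": new_id, "turns_seen": len(turns)}


server = ToyInteractionServer()
first = server.create("Find recent papers on solid-state batteries.")
# The client sends only the new message plus the previous id, not the transcript.
second = server.create("Summarize the top three.", previous_id=first["id"])
print(second["turns_seen"])
```

With this pattern the client only has to remember one identifier between turns, rather than resending and managing the full transcript on every request.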

Google plans further expansions, including native chart generation for visual reports and broader enterprise integration via Vertex AI.

**Open-source link**: [DeepSearchQA](https://www.kaggle.com/benchmarks/google/dsqa/leaderboard)
