Lei Jun Announces Second Audio Encoder Capability Challenge to Debut at Interspeech 2026, Registrations Now Open

Deep News
2025/12/15

On December 15, Lei Jun, founder, chairman, and CEO of Xiaomi, announced that the second Audio Encoder Capability Challenge (AECC), jointly initiated by Xiaomi, the University of Surrey, Tsinghua University, and Haitai Ruisheng, will debut alongside the prestigious international speech conference Interspeech 2026 in September next year. Registrations are now officially open.

Lei Jun stated that the competition aims to enhance the efficiency of audio encoders for large audio language models (LALMs) and encouraged participants to sign up.

Interspeech 2026, a top-tier global speech conference, will be held in Sydney, Australia, in September next year; the second AECC will run concurrently with the conference.

Currently, large audio language models (LALMs) are advancing rapidly, but most mainstream models rely heavily on a single audio front-end encoder, primarily OpenAI’s Whisper Encoder. This dependence limits architectural diversity and hinders further improvements in LALMs' overall capabilities. To address growing demands for audio comprehension, the challenge will focus on evaluating audio encoders' understanding and feature representation in complex real-world scenarios.

**1. Competition Overview**

**1.1 Evaluation Method**

The challenge adopts a unified end-to-end training and evaluation framework. Participants need only submit pre-trained encoder models; downstream task training and evaluation are handled by the organizers. The open-source evaluation system, XARES-LLM (https://github.com/xiaomi-research/xares-llm), automatically downloads the training data, trains a typical LALM on top of each submitted encoder, evaluates it on the downstream tasks, and reports scores.
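The authoritative submission interface is defined by the XARES-LLM repository itself; purely as an illustrative sketch (all class and method names below are hypothetical, not the repository's actual API), an encoder in this kind of end-to-end setup is essentially a module that maps raw waveforms to a sequence of frame-level embeddings, which the framework then attaches to an LLM and trains on downstream tasks:

```python
import torch
import torch.nn as nn

class SubmittedAudioEncoder(nn.Module):
    """Hypothetical encoder wrapper: names, shapes, and the toy backbone are
    illustrative only, not the actual XARES-LLM interface."""

    def __init__(self, feature_dim: int = 768):
        super().__init__()
        # Placeholder backbone; a real submission would load pre-trained weights.
        self.frontend = nn.Conv1d(1, feature_dim, kernel_size=400, stride=160)
        self.backbone = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=feature_dim, nhead=8, batch_first=True),
            num_layers=4,
        )
        self.output_dim = feature_dim  # the LALM adapter would read this

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, samples), mono audio at a fixed sample rate, e.g. 16 kHz
        feats = self.frontend(waveform.unsqueeze(1))  # (batch, dim, frames)
        feats = feats.transpose(1, 2)                 # (batch, frames, dim)
        return self.backbone(feats)                   # frame-level embeddings


if __name__ == "__main__":
    encoder = SubmittedAudioEncoder()
    dummy = torch.randn(2, 16000)   # two one-second clips at 16 kHz
    print(encoder(dummy).shape)     # e.g. torch.Size([2, 98, 768])
```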

Participants are not required to run XARES-LLM themselves; they need only package their audio encoders according to the provided guidelines and email them to the organizers for LALM training and evaluation. However, since XARES-LLM is open source and can run on a single RTX 4090, participants may also use it to assess their encoders' performance before submission.
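For those who do check locally before submitting, a minimal smoke test along the following lines can confirm that the packaged encoder loads, completes a forward pass, and stays within the memory budget of a single 24 GB consumer GPU such as the RTX 4090. The function name, the 30-second clip length, and the memory target are assumptions for illustration, not challenge requirements:

```python
import torch
import torch.nn as nn

def smoke_test(encoder: nn.Module, seconds: int = 30, sample_rate: int = 16000) -> None:
    """Hypothetical pre-submission check: run one forward pass on a long clip
    and report the output shape and peak GPU memory."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    encoder = encoder.to(device).eval()
    with torch.no_grad():
        clip = torch.randn(1, seconds * sample_rate, device=device)
        emb = encoder(clip)
    print("embedding shape:", tuple(emb.shape))
    if device == "cuda":
        peak_gib = torch.cuda.max_memory_allocated() / 1024 ** 3
        print(f"peak GPU memory: {peak_gib:.2f} GiB (should fit well under 24 GiB)")

# Example: smoke_test(SubmittedAudioEncoder()), using the wrapper sketched above
# or your own packaged encoder module.
```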

**1.2 Training Data**

Unlike most competitions, this challenge emphasizes both model design and data utilization. No specific training dataset is mandated: participants may use any publicly accessible data, including web-scraped sources, but proprietary data is prohibited. Models may be built on open-source pre-trained parameters or trained from scratch.
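As one illustration of the open-source pre-trained parameters option, a participant could warm-start from a publicly released self-supervised speech encoder and then adapt it on their chosen data. The sketch below assumes the Hugging Face transformers library and uses the microsoft/wavlm-base-plus checkpoint purely as an example, not as an endorsed baseline:

```python
import torch
from transformers import AutoFeatureExtractor, AutoModel

# Warm-start from an openly available checkpoint; wavlm-base-plus is only an example.
checkpoint = "microsoft/wavlm-base-plus"
feature_extractor = AutoFeatureExtractor.from_pretrained(checkpoint)
encoder = AutoModel.from_pretrained(checkpoint)

# 16 kHz mono audio -> normalized input values -> frame-level embeddings.
waveform = torch.randn(16000)  # one second of placeholder audio
inputs = feature_extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    embeddings = encoder(**inputs).last_hidden_state

print(embeddings.shape)  # (1, frames, hidden_size), e.g. (1, 49, 768)
```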

Haitai Ruisheng has provided a supplementary dataset, free for participants, derived from eight commercial datasets (e.g., King-ASR-457, King-ASR-958). It covers diverse environmental noises (e.g., bookstores, gyms, subways, restaurants), household background noises, non-speech interference (e.g., water flow, footsteps), and vehicle-related sounds (e.g., mechanical noise, wind noise). Details: https://dataoceanai.github.io/Interspeech2026-Audio-Encoder-Challenge/King_NonSpeech-Dataset_en_20h.html

**1.3 Tracks**

Two tracks are available:

- **Track A**: Focuses on traditional classification tasks.
- **Track B**: Evaluates comprehension and expressive capabilities.

Submissions will be assessed in both tracks, with independent rankings.

**2. Registration & Submission**

**2.1 Process**

- Register by January 25, 2026 (AoE): https://docs.google.com/forms/d/1oaTnhh0HVX8K2oRdHKXsnyZfBWb7F6Oj8xZ6yAiMI74/viewform
- Package the encoder code and model files (zip) and email them by February 12, 2026 (AoE).
- Submit a technical report (PDF) by February 25, 2026 (AoE), which may also serve as a conference paper for Interspeech.

**2.2 Contact**

Email: 2026interspeech-aecc@dataoceanai.com
Official site: https://dataoceanai.github.io/Interspeech2026-Audio-Encoder-Challenge/

