Lei Jun Announces Second Audio Encoder Capability Challenge to Debut at Interspeech 2026, Registrations Now Open

Deep News
2025/12/15

On December 15, Lei Jun, founder, chairman, and CEO of Xiaomi, announced that the second Audio Encoder Capability Challenge (AECC), jointly initiated by Xiaomi, the University of Surrey, Tsinghua University, and Haitai Ruisheng (DataOcean AI), will debut alongside the prestigious international speech conference Interspeech 2026 in September next year. Registrations are now officially open.

Lei Jun stated that the competition aims to enhance the efficiency of audio encoders for large audio language models (LALMs) and encouraged participants to sign up.

Interspeech 2026, a top-tier global speech conference, will be held in Sydney, Australia, in September next year. The second AECC, co-organized by Xiaomi, the University of Surrey, Tsinghua University, and Haitai Ruisheng, will run concurrently with the event.

Currently, large audio language models (LALMs) are advancing rapidly, but most mainstream models rely heavily on a single audio front-end encoder, primarily OpenAI’s Whisper Encoder. This dependence limits architectural diversity and hinders further improvements in LALMs' overall capabilities. To address growing demands for audio comprehension, the challenge will focus on evaluating audio encoders' understanding and feature representation in complex real-world scenarios.

**1. Competition Overview**

**1.1 Evaluation Method**

The challenge adopts a unified end-to-end training and evaluation framework. Participants need only submit pre-trained encoder models; downstream-task training and evaluation are handled by the organizers. The open-source evaluation system, XARES-LLM (https://github.com/xiaomi-research/xares-llm), automatically trains a typical LALM on top of each submitted encoder, downloads the training data, evaluates the downstream tasks, and reports scores.

Participants are not required to run XARES-LLM themselves; they only need to package their audio encoders according to the provided guidelines and email them to the organizers for large-model training and evaluation. Because XARES-LLM is open source and can run on a single RTX 4090, participants may also use it to assess their encoders' performance before submission.
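For orientation, the sketch below shows one way an encoder could be organized for a local sanity check before packaging: a plain PyTorch module mapping raw waveforms to frame-level embeddings. The class name, layer choices, embedding dimension, and sample rate are illustrative assumptions only; the authoritative submission interface is the one described in the organizers' packaging guidelines and the XARES-LLM repository.

```python
# Illustrative sketch only: the actual submission interface is defined by the
# organizers' packaging guidelines and the XARES-LLM repository, not by this code.
import torch
import torch.nn as nn


class MyAudioEncoder(nn.Module):
    """Hypothetical audio encoder wrapper: raw waveform in, frame embeddings out."""

    def __init__(self, embed_dim: int = 768, hop_samples: int = 320):
        super().__init__()
        self.embed_dim = embed_dim
        # Placeholder front end; a real submission would load pre-trained weights here.
        self.frontend = nn.Conv1d(1, embed_dim, kernel_size=400, stride=hop_samples)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, samples), mono audio at an assumed 16 kHz sample rate.
        feats = self.frontend(waveform.unsqueeze(1))  # (batch, embed_dim, frames)
        feats = feats.transpose(1, 2)                 # (batch, frames, embed_dim)
        return self.norm(feats)                       # frame-level embeddings


if __name__ == "__main__":
    # Quick shape check before packaging the model files for submission.
    encoder = MyAudioEncoder()
    dummy = torch.randn(2, 16000)   # two 1-second clips at 16 kHz
    print(encoder(dummy).shape)     # torch.Size([2, 49, 768])
```

Running such a wrapper through the open-source XARES-LLM evaluator before the deadline is a simple way to catch interface or shape mismatches early.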

**1.2 Training Data**

Unlike most competitions, this challenge emphasizes both model design and data utilization. No specific training dataset is mandated: participants may use any publicly accessible data, including web-scraped sources, but proprietary data is prohibited. Models may be built on open-source pre-trained parameters or trained from scratch.

Haitai Ruisheng has provided a supplementary dataset, free for participants, derived from eight commercial datasets (e.g., King-ASR-457, King-ASR-958). It covers diverse environmental noises (e.g., bookstores, gyms, subways, restaurants), household background noises, non-speech interference (e.g., water flow, footsteps), and vehicle-related sounds (e.g., mechanical noise, wind noise). Details: https://dataoceanai.github.io/Interspeech2026-Audio-Encoder-Challenge/King_NonSpeech-Dataset_en_20h.html

**1.3 Tracks**

Two tracks are available:

- **Track A**: Focuses on traditional classification tasks.
- **Track B**: Evaluates comprehension and expressive capabilities.

Submissions will be assessed in both tracks, with independent rankings.

**2. Registration & Submission**

**2.1 Process**

- Register by January 25, 2026 (AoE): https://docs.google.com/forms/d/1oaTnhh0HVX8K2oRdHKXsnyZfBWb7F6Oj8xZ6yAiMI74/viewform
- Package the encoder code and model files (zip) and email them by February 12, 2026 (AoE).
- Submit a technical report (PDF) by February 25, 2026 (AoE); the report may also serve as an Interspeech conference paper.

**2.2 Contact**

Email: 2026interspeech-aecc@dataoceanai.com

Official site: https://dataoceanai.github.io/Interspeech2026-Audio-Encoder-Challenge/

