Tongyi Open-Sources MAI-UI, a GUI Agent Foundation Model Family Across Sizes, with Native User-Interaction Capability

ifeng Tech (凤凰网科技)
Dec 29, 2025

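ifeng Tech, December 29: The multimodal interaction team at Tongyi Lab recently announced the open-source release of MAI-UI, its general-purpose GUI agent foundation model. The model is designed to complete complex, multi-step tasks that span several applications by understanding on-screen interfaces and performing actions on them. Examples include looking up train tickets, sharing information in a group chat, and rescheduling a meeting.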

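According to the team, MAI-UI can proactively ask the user clarifying questions when an instruction is ambiguous. It can also call structured tools (such as map search and route-planning APIs) in place of tedious on-screen clicking, which is intended to raise both the success rate and the efficiency of task execution. The model family comes in several parameter sizes, including 2B and 8B; the 2B and 8B models have been open-sourced.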
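To make the three kinds of behavior described above more concrete (clicking on the screen, asking the user a question, and calling a structured tool), here is a minimal Python sketch. It is an illustration under stated assumptions, not MAI-UI's actual interface: the names `Click`, `AskUser`, `ToolCall`, `choose_action`, `execute`, and `plan_route` are all hypothetical, and the rule-based `choose_action` merely stands in for a decision that the real model would make from the screen and the instruction.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union


@dataclass
class Click:
    """A low-level GUI action: tap the screen at (x, y)."""
    x: int
    y: int


@dataclass
class AskUser:
    """Pause the task and ask the user a clarifying question."""
    question: str


@dataclass
class ToolCall:
    """Call a structured tool instead of clicking through an app."""
    name: str
    args: Dict[str, str] = field(default_factory=dict)


Action = Union[Click, AskUser, ToolCall]


def choose_action(task: Dict[str, str], required: List[str],
                  tools: Dict[str, Callable[..., str]],
                  tool_name: str) -> Action:
    """Toy, rule-based stand-in for the model's decision (hypothetical)."""
    # 1. If the instruction leaves a required detail unset, ask the user.
    missing = [slot for slot in required if not task.get(slot)]
    if missing:
        return AskUser(f"Could you tell me the {missing[0]}?")
    # 2. If a structured tool is registered for the job, prefer it.
    if tool_name in tools:
        return ToolCall(tool_name, {slot: task[slot] for slot in required})
    # 3. Otherwise fall back to a GUI action (placeholder coordinates).
    return Click(x=0, y=0)


def execute(action: Action, tools: Dict[str, Callable[..., str]]) -> str:
    """Carry out an action and return a text description of the result."""
    if isinstance(action, ToolCall):
        return tools[action.name](**action.args)
    if isinstance(action, AskUser):
        return f"ASK: {action.question}"
    return f"CLICK: ({action.x}, {action.y})"


def plan_route(origin: str, destination: str) -> str:
    """Hypothetical route-planning tool; a real one would call a map API."""
    return f"route from {origin} to {destination}"


TOOLS = {"plan_route": plan_route}
```

In this sketch, a request with no destination yields an `AskUser` action, a complete request with `plan_route` registered yields a single `ToolCall`, and the same request with no tool available falls back to a `Click`.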

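According to evaluation data published by the team, MAI-UI achieved state-of-the-art results on several GUI understanding and task-execution benchmarks, including ScreenSpot-Pro and AndroidWorld. The model is suited to interface-interaction scenarios on phones, computers, and other devices running different operating systems.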
