Meituan Releases and Open-Sources LongCat-Video-Avatar, Focusing on More Realistic Motion

iFeng Tech (凤凰网科技)
Dec 18, 2025

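iFeng Tech, December 18: Meituan's LongCat team has officially released and open-sourced LongCat-Video-Avatar, a model for generating avatar (virtual human) videos. The model is built on the team's previously open-sourced LongCat-Video base model. It supports generating avatar videos from audio, text, or images, and also offers video continuation.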

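According to the team, the new model focuses on improving motion realism, stability in long-video generation, and identity consistency. Its "Disentangled Unconditional Guidance" technique lets the avatar show natural behavior, such as blinking and shifting posture, even during pauses in speech. To address the degradation in visual quality that is common in long-video generation, the team proposes a "Cross-Chunk Latent Stitching" strategy intended to avoid the accumulated error caused by repeated encoding and decoding. The team claims this supports generating videos up to 5 minutes long while keeping the picture stable.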
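
To illustrate why handing latents directly from one chunk to the next can avoid accumulated codec error, here is a minimal toy sketch in Python. It is not the LongCat-Video-Avatar implementation: the `encode`, `decode`, and `generate_chunk` functions are hypothetical stand-ins (a deliberately lossy 2x temporal codec and a generator that simply continues a ramp), chosen only so that the effect is easy to see.

```python
def decode(latents):
    """Toy decoder (assumption): 2x temporal upsampling that blends with the previous latent."""
    frames = []
    prev = latents[0]
    for value in latents:
        frames.extend([(prev + value) / 2, value])
        prev = value
    return frames


def encode(frames):
    """Toy encoder (assumption): average each pair of frames back into one latent."""
    return [(frames[i] + frames[i + 1]) / 2 for i in range(0, len(frames), 2)]


def generate_chunk(context, length=4):
    """Stand-in generator: continue a unit-slope ramp from the last context latent."""
    start = context[-1]
    return [start + step for step in range(1, length + 1)]


def generate_video(num_chunks, stitch_in_latent_space):
    """Generate a long sequence chunk by chunk, conditioning each chunk on the previous tail."""
    context = [0.0, 0.0]
    latents = []
    for _ in range(num_chunks):
        chunk = generate_chunk(context)
        latents.extend(chunk)
        tail = chunk[-2:]
        # Either pass the tail latents on directly, or round-trip them through "pixels".
        context = tail if stitch_in_latent_space else encode(decode(tail))
    return latents


def final_drift(num_chunks):
    """Gap between the two strategies at the last latent of the sequence."""
    stitched = generate_video(num_chunks, stitch_in_latent_space=True)
    round_trip = generate_video(num_chunks, stitch_in_latent_space=False)
    return abs(stitched[-1] - round_trip[-1])


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, final_drift(n))
```

In this toy setup, every decode-then-encode round trip slightly alters the context, so the gap between the two strategies grows with the number of chunks, while the latent hand-off introduces no codec error at all. A real video VAE and diffusion generator behave very differently in detail; the sketch only shows the accumulation pattern the paragraph describes.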

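For identity consistency, the model uses reference-frame injection with positional encoding together with a "Reference Skip Attention" mechanism, aiming to preserve the character's features while reducing motion stiffness. The team says that in evaluations on public datasets such as HDTF and CelebV-HQ, the model reaches state-of-the-art levels on lip-sync accuracy and consistency metrics, and that it leads in comprehensive tests covering scenarios such as commercial promotion and educational content.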
