Tether open-sources TurboQuant, compressing the KV cache on local AI devices by up to 5x

ChainCatcher
Yesterday

ChainCatcher reports that Tether's AI research team has announced the open-source release of a production version of TurboQuant and has integrated it into QVAC SDK 0.12.0.

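TurboQuant is based on a memory compression algorithm from Google Research. It can compress the KV cache used at AI runtime by up to 5x while keeping output quality close to that of the uncompressed model.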

This means laptops, phones, and edge devices can handle longer conversations, larger files, and more complex tasks without uploading data to the cloud.
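To make the "longer conversations" claim concrete, the sketch below estimates how many tokens of context fit in a fixed memory budget before and after a 5x KV cache reduction. The KV cache formula (keys plus values, per layer, per KV head) is the standard one for transformer decoders; the model shape, the 2 GiB budget, and the function names are hypothetical values chosen for illustration, not figures from Tether or the QVAC SDK.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Uncompressed KV cache size in bytes for one sequence."""
    # The factor 2 accounts for storing both a key and a value tensor per layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


def max_context_tokens(budget_bytes, bytes_per_token, compression_ratio=1):
    """Tokens that fit in the budget when the cache is shrunk by compression_ratio."""
    return int(budget_bytes * compression_ratio // bytes_per_token)


# Hypothetical model shape (assumption, not taken from the article).
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128

# Cache cost of a single token at 16-bit precision (2 bytes per value).
per_token = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=1)

# Hypothetical on-device memory budget reserved for the KV cache: 2 GiB.
budget = 2 * 1024**3

baseline = max_context_tokens(budget, per_token)
compressed = max_context_tokens(budget, per_token, compression_ratio=5)

print(f"KV cache per token: {per_token // 1024} KiB")
print(f"Context at 16-bit:  {baseline} tokens")
print(f"Context at 5x:      {compressed} tokens")
```

Under these assumptions the same budget holds five times as many tokens. The 5x figure is the upper bound quoted in the report, so actual gains depend on the model and settings.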

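The open-source release includes a complete quantization pipeline, adapters for mainstream inference frameworks, and developer documentation. It targets developers and startup teams deploying AI on consumer hardware, edge devices, and peer-to-peer networks.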
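For readers unfamiliar with what quantizing a cache involves, here is a minimal sketch of plain uniform scalar quantization applied to one vector. This is a generic textbook technique, not TurboQuant's algorithm and not the QVAC SDK API. The 3-bit width, the group size of 128, and the 32 bits of per-group metadata are assumptions picked only to show how a ratio near 5x can arise from 16-bit values.

```python
import math


def quantize(values, bits):
    """Uniformly quantize floats to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Guard against a constant vector, where hi == lo.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [min(levels, max(0, round((v - lo) / scale))) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Map integer codes back to approximate float values."""
    return [lo + c * scale for c in codes]


def compression_ratio(bits, group_size, source_bits=16, metadata_bits=32):
    """Ratio of original to stored size, counting per-group offset and scale."""
    effective_bits = bits + metadata_bits / group_size
    return source_bits / effective_bits


# Stand-in for one 128-value slice of a key or value tensor (synthetic data).
vector = [math.sin(i) for i in range(128)]

codes, lo, scale = quantize(vector, bits=3)
restored = dequantize(codes, lo, scale)
max_err = max(abs(a - b) for a, b in zip(vector, restored))

print(f"max reconstruction error: {max_err:.4f} (step size {scale:.4f})")
print(f"compression ratio: {compression_ratio(3, 128):.2f}x")
```

The reconstruction error of this naive scheme is bounded by half the step size. Production methods aim for much lower quality loss at the same bit width, which is where the actual algorithm differs from this sketch.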

