NVIDIA Joint Team Launches Fast-dLLM Framework, Substantially Speeding Up Diffusion Model Inference

DoNews

Jun 03, 2025

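NVIDIA, together with MIT and the University of Hong Kong, recently introduced Fast-dLLM, a framework aimed at the efficiency bottleneck that diffusion-based large language models (diffusion-based LLMs) face in practical use. Although these models' bidirectional attention gives them theoretical advantages, their high computational cost and the quality degradation that occurs when multiple tokens are decoded simultaneously have limited their wider adoption. Fast-dLLM addresses both problems with a block-wise approximate KV cache and a confidence-aware parallel decoding strategy. The KV cache divides the sequence into blocks and precomputes activations, reducing redundant computation; confidence-aware decoding selectively decodes only high-confidence tokens, avoiding dependency conflicts. In reported tests, the framework achieved a 27.6x speedup on the GSM8K dataset with 76.0% accuracy, and it also performed well on other benchmarks. The work balances speed and quality and opens a new path for applying diffusion models to language generation tasks.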
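As a rough illustration of the confidence-aware parallel decoding idea mentioned above, the sketch below fills in several masked positions in one step, but only those whose top predicted probability clears a threshold. It is not the Fast-dLLM implementation: the function name `confident_unmask_step`, the `MASK` marker, the `threshold` value of 0.9, the hand-written probability table, and the fallback of decoding the single most confident position when nothing clears the threshold are all assumptions made for this example.

```python
from typing import List, Optional

MASK = None  # hypothetical marker for a position that has not been decoded yet


def confident_unmask_step(
    tokens: List[Optional[int]],
    probs: List[List[float]],
    threshold: float = 0.9,  # assumed value, chosen only for illustration
) -> List[Optional[int]]:
    """Decode, in parallel, every masked position whose confidence >= threshold.

    tokens: current sequence, with MASK at undecoded positions.
    probs:  one distribution over the vocabulary per position
            (a stand-in for what a diffusion LLM would predict).
    """
    out = list(tokens)
    masked = [i for i, t in enumerate(tokens) if t is MASK]
    if not masked:
        return out

    # Confidence of a position = probability of its most likely token.
    best = {i: max(range(len(probs[i])), key=probs[i].__getitem__) for i in masked}
    conf = {i: probs[i][best[i]] for i in masked}

    chosen = [i for i in masked if conf[i] >= threshold]
    if not chosen:
        # Assumed fallback so that every step makes progress.
        chosen = [max(masked, key=conf.__getitem__)]

    for i in chosen:
        out[i] = best[i]
    return out


# Toy vocabulary of three token ids (0, 1, 2); position 0 is already decoded.
probs = [
    [1.00, 0.00, 0.00],
    [0.05, 0.95, 0.00],  # confident  -> decoded in the first step
    [0.40, 0.35, 0.25],  # uncertain  -> left masked for a later step
    [0.02, 0.03, 0.95],  # confident  -> decoded in the first step
]
step1 = confident_unmask_step([0, MASK, MASK, MASK], probs)
print(step1)  # [0, 1, None, 2]
```

In an actual diffusion LLM the probabilities would be recomputed after each step, so the uncertain position gets re-scored with the newly decoded tokens as context. Deferring low-confidence positions in this way is what lets the strategy avoid committing to several mutually dependent tokens at once.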

Source Link: http://gu.qq.com/resources/shy/news/detail-v2/index.html#/?id=nesSN20250603113812976f4026&s=b

