FFmpeg, the General-Purpose Multimedia Framework, Adds AI Speech Recognition

Market News

Aug 13, 2025


(Source: IT之家)

IT之家, August 13: FFmpeg, the popular open-source multimedia framework, now includes a new audio tool, af_whisper, that brings automatic speech recognition (ASR) directly into the FFmpeg ecosystem.

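The tool is built on the whisper.cpp library and adds an AI model to media-processing workflows. It allows flexible audio-to-text transcription, including choosing the AI model, specifying the language, and setting the output format, such as plain text, SRT, or JSON.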
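As a rough illustration of those three choices, the sketch below assembles a filter string in Python. The filter name `whisper` and the option names `model`, `language`, `destination`, and `format` are assumptions based on the FFmpeg filter documentation, not something the article spells out; the helper `build_whisper_filter` and the file names are hypothetical. Running `ffmpeg -h filter=whisper` on a build that includes the filter should show the options it actually accepts.

```python
# Output formats named in the article.
ALLOWED_FORMATS = {"text", "srt", "json"}

# Characters with special meaning in an FFmpeg filtergraph. This sketch
# rejects them instead of attempting to escape them.
_SPECIAL = set(":,;'\\[]")


def build_whisper_filter(model, language, destination, fmt):
    """Return an -af argument for the (assumed) whisper filter.

    All option names used here are assumptions; verify them against
    your own FFmpeg build before relying on this.
    """
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    options = {
        "model": model,              # path to a whisper.cpp model file
        "language": language,        # spoken language of the audio
        "destination": destination,  # where the transcript is written
        "format": fmt,               # text, srt, or json
    }
    for key, value in options.items():
        if _SPECIAL & set(value):
            raise ValueError(f"{key} needs filtergraph escaping: {value!r}")
    return "whisper=" + ":".join(f"{k}={v}" for k, v in options.items())


if __name__ == "__main__":
    # Hypothetical model and output file names.
    print(build_whisper_filter("ggml-base.en.bin", "en", "talk.srt", "srt"))
```

Rejecting special characters keeps the sketch short; a real script would escape them according to FFmpeg's filtergraph quoting rules.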

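The tool can process both pre-recorded files and live audio streams, and users can enable voice activity detection (VAD) to improve transcription accuracy and efficiency.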

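IT之家 notes that the tool also supports GPU acceleration, which can significantly speed up transcription. For users, the feature replaces external, multi-step transcription pipelines and consolidates the task into a single, efficient command-line workflow.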
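A single-command workflow of this kind might look like the sketch below, which only builds the argument list and does not run anything. It assumes an FFmpeg build with whisper support enabled, and the option names `use_gpu` and `vad_model` (like `model`, `destination`, and `format`) are assumptions taken from the FFmpeg filter documentation rather than from the article. The function `whisper_command` and all file names are hypothetical.

```python
import shlex


def whisper_command(src, model, destination, fmt="srt",
                    use_gpu=True, vad_model=None):
    """Build an ffmpeg argv that transcribes `src` in one pass.

    The filter name and option names are assumptions; check them with
    `ffmpeg -h filter=whisper` on your own build.
    """
    opts = [
        f"model={model}",
        f"destination={destination}",
        f"format={fmt}",
        f"use_gpu={'true' if use_gpu else 'false'}",
    ]
    if vad_model is not None:
        # Optional voice activity detection model (assumed option name).
        opts.append(f"vad_model={vad_model}")
    audio_filter = "whisper=" + ":".join(opts)
    # -vn drops video; "-f null -" discards the decoded audio, since the
    # transcript file is the only output of interest here.
    return ["ffmpeg", "-i", src, "-vn", "-af", audio_filter, "-f", "null", "-"]


if __name__ == "__main__":
    cmd = whisper_command("talk.mp4", "ggml-base.en.bin", "talk.srt")
    print(shlex.join(cmd))
```

Passing the returned list to `subprocess.run(cmd, check=True)` would execute it. Building an argument list rather than a shell string avoids a layer of shell quoting around the filter expression.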

