OpenAI开源HealthBench，60个国家合力开发5000段真实对话

金色财经

May 13, 2025

【OpenAI开源HealthBench，60个国家合力开发5000段真实对话】金色财经报道，OpenAI开源了一个专门面向医疗大模型的测试评估集——HealthBench。与以往测试集不同的是，该测试集的5000段核心测试对话，全部由来自60个国家/地区的26个专业262名医生打造，极大增强了该测试集的难度、真实性以及丰富度。并且采用了多轮对话测试，而不是简单的答题或选择题模式。根据测试数据显示，大模型在医疗保健领域的表现有了显著提升。例如，从之前的GPT-3.5Turbo的16%到GPT-4o的32%，再到o3的60%，整体性能有了显著进步。尤其是小型模型的进步更为突出，GPT-4.1nano不仅在性能上超越了GPT-4o，而且成本降低了25倍。

Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation on acquiring or disposing of any financial products, any associated discussions, comments, or posts by author or other users should not be considered as such either. It is solely for general information purpose only, which does not consider your own investment objectives, financial situations or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information, investors should do their own research and may seek professional advice before investing.

Most Discussed

{"basename":"","ssrTDKData":{"titleTemplate":"%s - Tiger Brokers","title":"Tiger Brokers | Global Stocks, Options & Futures Trading App","description":"Tiger Brokers, one-stop investment in US stocks, SGX stocks, HK stocks, A-shares & other global assets. One of the best stock trading platforms in Singapore.","keywords":"tiger brokers,tiger trade,tiger brokers singapore,broker online,stock trading in singapore,share trading singapore,brokerage firm singapore,trading app,stock broker singapore,stock trading platforms,trading account","social":{"ogDescription":"Tiger Brokers, one-stop investment in US stocks, SGX stocks, HK stocks, A-shares & other global assets. One of the best stock trading platforms in Singapore.","ogImage":"https://c1.itigergrowtha.com/portal5/static/media/og-logo.be62fbe1.png","ogUrl":"https://www.itiger.com/news/2535119735"},"companyName":"Tiger Brokers"},"pageData":{"isMobile":false,"isTiger":false,"isTTM":true,"region":"SGP","license":"TBSG","edition":"fundamental"},"isCrawlerRequest":true,"__swrFallback__":{"@#url:\"https://stock-news.skytigris.cn/v3/news\",params:#id:\"2535119735\",edition:\"fundamental\",auth_exemption:1,,,undefined,":{"share":"https://ttm.financial/m/news/2535119735?lang=en_US&edition=fundamental","thumbnail":"","is_english":false,"pubTime":"2025-05-13 06:56","share_image_url":"https://static.laohu8.com/e9f99090a1c2ed51c021029395664489","id":"2535119735","market":null,"top_or_hot":-1,"title":"OpenAI开源HealthBench，60个国家合力开发5000段真实对话","media":"金色财经","content":"<html><body><p>【OpenAI开源HealthBench，60个国家合力开发5000段真实对话】金色财经报道，OpenAI开源了一个专门面向医疗大模型的测试评估集——HealthBench。与以往测试集不同的是，该测试集的5000段核心测试对话，全部由来自60个国家/地区的26个专业262名医生打造，极大增强了该测试集的难度、真实性以及丰富度。并且采用了多轮对话测试，而不是简单的答题或选择题模式。根据测试数据显示，大模型在医疗保健领域的表现有了显著提升。例如，从之前的GPT-3.5Turbo的16%到GPT-4o的32%，再到o3的60%，整体性能有了显著进步。尤其是小型模型的进步更为突出，GPT-4.1nano不仅在性能上超越了GPT-4o，而且成本降低了25倍。</p></body></html>","source":"jinse_live","html":"<!DOCTYPE html>\n<html>\n<head>\n<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" />\n<meta name=\"viewport\" content=\"width=device-width,initial-scale=1.0,minimum-scale=1.0,maximum-scale=1.0,user-scalable=no\"/>\n<meta name=\"format-detection\" content=\"telephone=no,email=no,address=no\" />\n<title>OpenAI开源HealthBench，60个国家合力开发5000段真实对话</title>\n<style type=\"text/css\">\na,abbr,acronym,address,applet,article,aside,audio,b,big,blockquote,body,canvas,caption,center,cite,code,dd,del,details,dfn,div,dl,dt,\nem,embed,fieldset,figcaption,figure,footer,form,h1,h2,h3,h4,h5,h6,header,hgroup,html,i,iframe,img,ins,kbd,label,legend,li,mark,menu,nav,\nobject,ol,output,p,pre,q,ruby,s,samp,section,small,span,strike,strong,sub,summary,sup,table,tbody,td,tfoot,th,thead,time,tr,tt,u,ul,var,video{ font:inherit;margin:0;padding:0;vertical-align:baseline;border:0 }\nbody{ font-size:16px; line-height:1.5; color:#999; background:transparent; }\n.wrapper{ overflow:hidden;word-break:break-all;padding:10px; }\nh1,h2{ font-weight:normal; line-height:1.35; margin-bottom:.6em; }\nh3,h4,h5,h6{ line-height:1.35; margin-bottom:1em; }\nh1{ font-size:24px; }\nh2{ font-size:20px; }\nh3{ font-size:18px; }\nh4{ font-size:16px; }\nh5{ font-size:14px; }\nh6{ font-size:12px; }\np,ul,ol,blockquote,dl,table{ margin:1.2em 0; }\nul,ol{ margin-left:2em; }\nul{ list-style:disc; }\nol{ list-style:decimal; }\nli,li p{ margin:10px 0;}\nimg{ max-width:100%;display:block;margin:0 auto 1em; }\nblockquote{ color:#B5B2B1; border-left:3px solid #aaa; padding:1em; }\nstrong,b{font-weight:bold;}\nem,i{font-style:italic;}\ntable{ width:100%;border-collapse:collapse;border-spacing:1px;margin:1em 0;font-size:.9em; }\nth,td{ padding:5px;text-align:left;border:1px solid #aaa; }\nth{ font-weight:bold;background:#5d5d5d; }\n.symbol-link{font-weight:bold;}\n/* header{ border-bottom:1px solid #494756; } */\n.title{ margin:0 0 8px;line-height:1.3;color:#ddd; }\n.meta {color:#5e5c6d;font-size:13px;margin:0 0 .5em; }\na{text-decoration:none; color:#2a4b87;}\n.meta .head { display: inline-block; overflow: hidden}\n.head .h-thumb { width: 30px; height: 30px; margin: 0; padding: 0; border-radius: 50%; float: left;}\n.head .h-content { margin: 0; padding: 0 0 0 9px; float: left;}\n.head .h-name {font-size: 13px; color: #eee; margin: 0;}\n.head .h-time {font-size: 11px; color: #7E829C; margin: 0;line-height: 11px;}\n.small {font-size: 12.5px; display: inline-block; transform: scale(0.9); -webkit-transform: scale(0.9); transform-origin: left; -webkit-transform-origin: left;}\n.smaller {font-size: 12.5px; display: inline-block; transform: scale(0.8); -webkit-transform: scale(0.8); transform-origin: left; -webkit-transform-origin: left;}\n.bt-text {font-size: 12px;margin: 1.5em 0 0 0}\n.bt-text p {margin: 0}\n</style>\n</head>\n<body>\n<div class=\"wrapper\">\n<header>\n<h2 class=\"title\">\nOpenAI开源HealthBench，60个国家合力开发5000段真实对话\n</h2>\n\n<h4 class=\"meta\">\n\n\n2025-05-13 06:56 北京时间&nbsp;&nbsp;&nbsp;<a href=https://finance.sina.com.cn/7x24/2025-05-13/doc-inewkfen2383083.shtml><strong>金色财经</strong></a>\n\n\n</h4>\n\n</header>\n<article>\n<div>\n<p>【OpenAI开源HealthBench，60个国家合力开发5000段真实对话】金色财经报道，OpenAI开源了一个专门面向医疗大模型的测试评估集——HealthBench。与以往测试集不同的是，该测试集的5000段核心测试对话，全部由来自60个国家/地区的26个专业262名医生打造，极大增强了该测试集的难度、真实性以及丰富度。并且采用了多轮对话测试，而不是简单的答题或选择题模式。根据测试数据显示...</p>\n\n<a href=\"https://finance.sina.com.cn/7x24/2025-05-13/doc-inewkfen2383083.shtml\">Source Link</a>\n\n</div>\n\n\n</article>\n</div>\n</body>\n</html>\n","isBrief":false,"type":0,"news_type":1,"symbol":"BK7087","symbol_name":"多样化房地产投资信托v","start_time":0,"source_url":"https://finance.sina.com.cn/7x24/2025-05-13/doc-inewkfen2383083.shtml","article_id":"2535119735","we_media_id":null,"thumbnails":[],"rights":null,"url":"https://stock-news.laohu8.com/highlight/detail?id=2535119735","pubTimestamp":1747090562,"columns":[],"sourceInfo":{"source_id":"jinse_live","name":"金色财经"},"weMediaInfo":null,"summary":"【OpenAI开源HealthBench，60个国家合力开发5000段真实对话】金色财经报道，OpenAI开源了一个专门面向医疗大模型的测试评估集——HealthBench。与以往测试集不同的是，该测","collect":0,"end_time":0,"defaultTopTitle":"sina.com.cn","property":[],"viewcount":null,"language":"zh","relate_stocks":{"159891":"医疗","BK7087":"多样化房地产投资信托v","GPT.AU":"GPT GROUP","BK7510":"REITs概念"},"translate_title":"OpenAI open source HealthBench, 60 countries work together to develop 5,000 real conversations","themeId":null,"isJumpTheme":false,"ttsUrl":null,"symbols_score_info":{"159891":1,"GPT.AU":1},"content_text":"【OpenAI开源HealthBench，60个国家合力开发5000段真实对话】金色财经报道，OpenAI开源了一个专门面向医疗大模型的测试评估集——HealthBench。与以往测试集不同的是，该测试集的5000段核心测试对话，全部由来自60个国家/地区的26个专业262名医生打造，极大增强了该测试集的难度、真实性以及丰富度。并且采用了多轮对话测试，而不是简单的答题或选择题模式。根据测试数据显示，大模型在医疗保健领域的表现有了显著提升。例如，从之前的GPT-3.5Turbo的16%到GPT-4o的32%，再到o3的60%，整体性能有了显著进步。尤其是小型模型的进步更为突出，GPT-4.1nano不仅在性能上超越了GPT-4o，而且成本降低了25倍。","kind":"live","is_publish_news":false,"is_publish_highlight":false,"is_publish_live":true,"is_publish_wemedia":null,"editions":null,"column":"","sentiment":"0","news_tag":"","news_rank":0,"symbols":[],"gpt_button":0,"need_auth":false,"code":"91000000","status":"200"}}}