HuatuoGPT, Towards Taming Language Models To Be a Doctor
Hongbo Zhang, Junying Chen, Feng Jiang, Fei Yu, Zhihong Chen, Jianquan Li, Guiming Hardy Chen, Xiangbo Wu, Zhiyi Zhang, Qingying Xiao, Xiang Wan, Benyou Wang, Haizhou Li
Abstract: In this paper, we present HuatuoGPT, a Large Language Model (LLM) for medical consultation. The core recipe of HuatuoGPT is to leverage both distilled data from ChatGPT and real-world data from doctors in the supervised fine-tuning stage. We do so for two reasons: relying purely on ChatGPT-distilled data might cause ‘model collapse’, and real-world data from doctors is complementary to ChatGPT-distilled data. Responses from ChatGPT are usually detailed, well-presented, fluent, and instruction-following, but ChatGPT cannot perform like a doctor in many respects, e.g., in interactive diagnosis. The additional doctors’ data can therefore tame a distilled language model to perform like a doctor. To synergize the strengths of both data sources, we introduce RLMF (Reinforcement Learning from Mixed Feedback), in which a reward model is trained to align the language model with the merits that both sources (ChatGPT and doctors) bring. Experimental results (in GPT-4 evaluation, human evaluation, and medical benchmark datasets) demonstrate that HuatuoGPT achieves state-of-the-art results in medical consultation.
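As an illustration of the data recipe described in the abstract, the following minimal sketch combines ChatGPT-distilled examples with real-world doctor examples into one supervised fine-tuning set. It is not the authors' pipeline: the function name `mix_sft_data`, the `real_fraction` parameter, the `source` tag, and the `prompt`/`response` field names are all hypothetical, and the abstract does not state any mixing ratio.

```python
import random


def mix_sft_data(distilled, real, real_fraction=0.5, seed=0):
    """Return a shuffled SFT set in which about `real_fraction` of the
    examples come from doctors and the rest from ChatGPT distillation.

    All names and the ratio are illustrative assumptions, not values
    taken from the paper.
    """
    if not 0.0 < real_fraction < 1.0:
        raise ValueError("real_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    # Largest total size for which both sources can supply their share.
    total = int(min(len(real) / real_fraction,
                    len(distilled) / (1.0 - real_fraction)))
    n_real = round(total * real_fraction)
    n_distilled = total - n_real
    # Tag each example with its origin so later stages can tell them apart.
    mixed = (
        [{"source": "doctor", **ex} for ex in rng.sample(real, n_real)]
        + [{"source": "chatgpt", **ex} for ex in rng.sample(distilled, n_distilled)]
    )
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    distilled = [{"prompt": f"distilled question {i}", "response": "..."} for i in range(30)]
    real = [{"prompt": f"patient question {i}", "response": "..."} for i in range(10)]
    mixed = mix_sft_data(distilled, real, real_fraction=0.5, seed=1)
    print(len(mixed), sum(ex["source"] == "doctor" for ex in mixed))
```

Keeping a `source` tag on each example is a design choice of this sketch: it would let a later stage, such as reward-model training on feedback from both sources, distinguish where an example came from.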