ChatGPT Performs on the Chinese National Medical Licensing Examination

Xinyi Wang, Zhenye Gong, Guoxin Wang, Jingdan Jia, Ying Xu, Jialu Zhao, Qingye Fan, Shaun Wu, Weiguo Hu, Xiaoyang Li
DOI: https://doi.org/10.1007/s10916-023-01961-0
IF: 4.92
2023-08-16
Journal of Medical Systems
Abstract: ChatGPT, a language model developed by OpenAI, uses a 175-billion-parameter Transformer architecture for natural language processing tasks. This study aimed to compare the knowledge and interpretation ability of ChatGPT with those of medical students in China by administering the Chinese National Medical Licensing Examination (NMLE) to both ChatGPT and medical students. We evaluated the performance of ChatGPT in three years' worth of the NMLE, which consists of four units. At the same time, the exam results were compared to those of medical students who had studied for five years at medical colleges. ChatGPT's performance was lower than that of the medical students, and ChatGPT's correct answer rate was related to the year in which the exam questions were released. ChatGPT's knowledge and interpretation ability for the NMLE were not yet comparable to those of medical students in China. It is probable that these abilities will improve through deep learning.
health care sciences & services, medical informatics