MedGo: A Chinese Medical Large Language Model

Haitao Zhang, Bo An
2024-10-27
Abstract: Large models are a hot research topic in the field of artificial intelligence. Leveraging their generative capabilities has the potential to enhance the level and quality of medical services. In response to the limitations of current large language models, which often struggle with accuracy and have narrow capabilities in medical applications, this paper presents a Chinese medical large language model, MedGo. MedGo was trained on a combination of high-quality unsupervised medical data, supervised data, and preference alignment data, with the aim of enhancing both its versatility and its precision in medical tasks. The model was evaluated on the public CBLUE benchmark and on a manually constructed dataset, ClinicalQA. The results show that MedGo achieved promising performance across various Chinese medical information processing tasks and took first place in the CBLUE evaluation. Additionally, on our constructed dataset ClinicalQA, MedGo outperformed its base model Qwen2, highlighting its potential to improve both automated medical question answering and clinical decision support. These experimental results demonstrate that MedGo possesses strong information processing capabilities in the medical field. At present, we have successfully deployed MedGo at Shanghai East Hospital.
Computation and Language, Artificial Intelligence
What problem does this paper attempt to address?
The paper attempts to address the limitations of current large language models in medical applications, including insufficient accuracy and a narrow range of functions. Specifically, existing large language models often struggle to achieve high precision when handling medical tasks, and they perform poorly in interpretability and in specific medical tasks such as disease classification, medical record generation, and knowledge extraction. To tackle these challenges, the paper proposes MedGo, a large language model specifically designed for the Chinese medical field. The main goals of MedGo are:

1. **Improve accuracy**: Enhance the model's accuracy and reliability in medical tasks through training with high-quality unsupervised, supervised, and preference-aligned data.
2. **Enhance versatility**: Enable the model to handle various medical information processing tasks, including automated medical Q&A and clinical decision support.
3. **Increase interpretability**: Ensure that the model's outputs have high interpretability so that medical professionals can understand the model's reasoning process.
4. **Practical application**: Validate the effectiveness and practicality of MedGo in real medical scenarios through evaluations on public benchmarks and self-built datasets.

The paper demonstrates the potential of MedGo in improving the quality and efficiency of medical information processing by constructing large-scale medical datasets, optimizing training methods, and evaluating model performance.