Aqulia-Med LLM: Pioneering Full-Process Open-Source Medical Language Models

Lulu Zhao, Weihao Zeng, Xiaofeng Shi, Hua Zhou, Donglin Hao, Yonghua Lin

2024-06-18

Abstract: Recently, both closed-source LLMs and open-source communities have made significant strides, outperforming humans in various general domains. However, their performance in specific professional fields such as medicine, especially within the open-source community, remains suboptimal due to the complexity of medical knowledge. We propose Aquila-Med, a bilingual medical LLM based on Aquila, addressing these challenges through continued pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF). We construct a large-scale Chinese and English medical dataset for continued pre-training and a high-quality SFT dataset covering extensive medical specialties. Additionally, we develop a high-quality Direct Preference Optimization (DPO) dataset for further alignment. Aquila-Med achieves notable results across single-turn dialogues, multi-turn dialogues, and medical multiple-choice questions, demonstrating the effectiveness of our approach. We open-source the datasets and the entire training process, contributing valuable resources to the research community. Our models and datasets will be released at https://huggingface.co/BAAI/AquilaMed-RL.

Computation and Language, Artificial Intelligence

What problem does this paper attempt to address?

This paper addresses the underperformance of large language models (LLMs) in medicine. General-domain LLMs have surpassed human performance in many areas, but their performance in specialized professional fields such as medicine remains suboptimal, especially for open-source models. The authors propose Aquila-Med, a bilingual medical LLM based on Aquila, and improve it in three stages: continued pre-training, supervised fine-tuning (SFT), and reinforcement learning from human feedback (RLHF). For the first stage they construct a large-scale Chinese and English medical dataset. For the second they build a high-quality SFT dataset of approximately 330,000 examples covering a wide range of medical specialties. For the third they develop a high-quality Direct Preference Optimization (DPO) dataset that includes medical question-answering and multiple-choice questions. Aquila-Med performs well on single-turn dialogue, multi-turn dialogue, and medical multiple-choice questions, which the authors present as evidence that the approach is effective. The paper also open-sources the datasets and the entire training process, and the model and datasets will be released on Hugging Face for other researchers to use. In summary, the paper aims to improve the accuracy and safety of language models in the medical field by strengthening the model's ability to understand and apply complex medical knowledge.
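To make the DPO stage mentioned above more concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. It is not taken from the Aquila-Med code: the function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions for illustration, and the summed log-probabilities would in practice come from the model being trained and a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) answer pair.

    Each argument is the summed log-probability of a full answer given
    the prompt. `beta` is a hypothetical default, not the paper's value.
    """
    # How much more (in log space) the policy likes each answer than the
    # frozen reference model does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp

    # The loss is -log(sigmoid(z)) with z scaled by beta.
    z = beta * (chosen_margin - rejected_margin)

    # Numerically stable form of -log(sigmoid(z)) = log(1 + exp(-z)).
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Hypothetical log-probabilities for one medical QA preference pair.
loss = dpo_loss(policy_chosen_logp=-12.0, policy_rejected_logp=-15.0,
                ref_chosen_logp=-13.0, ref_rejected_logp=-14.0)
print(f"DPO loss: {loss:.4f}")
```

The loss falls below log 2 when the policy favors the chosen answer over the rejected one more strongly than the reference model does, and rises above log 2 in the opposite case, which is what pushes the model toward the preferred answers.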

Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model

PMC-LLaMA: Towards Building Open-source Language Models for Medicine

Towards Building Multilingual Language Model for Medicine

PMC-LLaMA: toward building open-source language models for medicine

Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue

Improving Clinical Expertise in Large Language Models Using Electronic Medical Records

Apollo: A Lightweight Multilingual Medical LLM towards Democratizing Medical AI to 6B People

MedChatZH: A tuning LLM for traditional Chinese medicine consultations

UltraMedical: Building Specialized Generalists in Biomedicine

A Survey on Large Language Models from General Purpose to Medical Applications: Datasets, Methodologies, and Evaluations

Me LLaMA: Foundation Large Language Models for Medical Applications

OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models

LLMs for Doctors: Leveraging Medical LLMs to Assist Doctors, Not Replace Them

MedChatZH: a Better Medical Adviser Learns from Better Instructions

LlamaCare: A Large Medical Language Model for Enhancing Healthcare Knowledge Sharing

Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach

Application and technology of an open source AI large language model in the medical field

AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator

MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning