Abstract

Objective: Large language models (LLMs) have surged across many fields, but their potential has yet to be fully realized in traditional Chinese medicine (TCM). This study aims to bridge this gap by developing an LLM tailored to TCM knowledge, improving performance and accuracy in clinical reasoning tasks such as diagnosis, treatment, and prescription recommendation.

Materials and methods: This study drew on a wide range of TCM data resources, including ancient TCM books, textbooks, and clinical data, to create 3 key datasets: the TCM Pre-trained Dataset, the Traditional Chinese Patent Medicine (TCPM) Question Answering Dataset, and the Spleen and Stomach Herbal Prescription Recommendation Dataset. These datasets underpinned the development of the Lingdan Pre-trained LLM and 2 specialized models: the Lingdan-TCPM-Chat Model, which uses a Chain-of-Thought process for symptom analysis and TCPM recommendation, and the Lingdan Prescription Recommendation Model (Lingdan-PR), which proposes herbal prescriptions based on electronic medical records.

Results: The Lingdan-TCPM-Chat and Lingdan-PR models, both fine-tuned from the Lingdan Pre-trained LLM, achieved state-of-the-art performance on TCM clinical knowledge question answering and herbal prescription recommendation. Notably, Lingdan-PR outperformed all state-of-the-art baseline models, improving the Top@20 F1-score by 18.39% over the best baseline.

Conclusion: This study marks a pivotal step in merging advanced LLMs with TCM, showcasing the potential of artificial intelligence to help improve clinical decision-making in diagnosis and treatment. The success of the Lingdan Pre-trained LLM and its derivative models, Lingdan-TCPM-Chat and Lingdan-PR, not only revolutionizes TCM practice but also opens new avenues for applying artificial intelligence in other specialized medical fields.
Our project is available at https://github.com/TCMAI-BJTU/LingdanLLM.
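The abstract reports results as a Top@20 F1-score but does not define the metric. As a minimal sketch, assuming the common top-k definition (precision and recall of the k highest-ranked herbs against the reference prescription, combined as their harmonic mean), it could be computed as follows. The function name `top_k_f1`, the herb names, and the choice of denominator for precision are illustrative assumptions, not taken from the paper or its repository.

```python
def top_k_f1(ranked_herbs, true_herbs, k=20):
    """Return (precision, recall, F1) for the top-k ranked herbs.

    Assumption: ranked_herbs is ordered from most to least likely and
    contains no duplicates; true_herbs is the reference prescription.
    """
    top_k = ranked_herbs[:k]
    truth = set(true_herbs)
    # Count recommended herbs that appear in the reference prescription.
    hits = sum(1 for herb in top_k if herb in truth)
    # Precision is taken over the herbs actually returned (at most k).
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: a ranked list from a model and a reference prescription.
ranked = ["baizhu", "fuling", "gancao", "chenpi", "banxia"]
reference = ["baizhu", "gancao", "dangshen"]
precision, recall, f1 = top_k_f1(ranked, reference, k=4)
```

In this toy case, 2 of the 4 top-ranked herbs are in the reference prescription of 3 herbs, so precision is 1/2 and recall is 2/3. A corpus-level score would typically average the per-record values, though the paper's exact aggregation is not stated in the abstract.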

Lingdan: enhancing encoding of traditional Chinese medicine knowledge for clinical reasoning tasks with large language models
