CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis

Junying Chen, Chi Gui, Anningzhe Gao, Ke Ji, Xidong Wang, Xiang Wan, Benyou Wang
2024-09-15
Abstract: The field of medical diagnosis has undergone a significant transformation with the advent of large language models (LLMs), yet the challenges of interpretability within these models remain largely unaddressed. This study introduces Chain-of-Diagnosis (CoD) to enhance the interpretability of LLM-based medical diagnostics. CoD transforms the diagnostic process into a diagnostic chain that mirrors a physician's thought process, providing a transparent reasoning pathway. Additionally, CoD outputs the disease confidence distribution to ensure transparency in decision-making. This interpretability makes model diagnostics controllable and aids in identifying critical symptoms for inquiry through the entropy reduction of confidences. With CoD, we developed DiagnosisGPT, capable of diagnosing 9604 diseases. Experimental results demonstrate that DiagnosisGPT outperforms other LLMs on diagnostic benchmarks. Moreover, DiagnosisGPT provides interpretability while ensuring controllability in diagnostic rigor.
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to address the issue of interpretability in AI medical diagnosis. Specifically:

1. **Arbitrary Diagnosis**: Existing large language models (LLMs) like GPT-4 tend to make diagnoses directly without sufficiently inquiring about symptoms, which can lead to inaccurate diagnostic results.
2. **Ineffective Inquiry**: Even with subsequent inquiries, the diagnostic accuracy of LLMs does not significantly improve and may even decline on certain datasets. This indicates that LLMs have weak capabilities in symptom inquiry.

To tackle these issues, the paper proposes the **Chain of Diagnosis (CoD)** method to enhance the interpretability and controllability of LLMs in disease diagnosis. CoD achieves this goal through the following means:

- **Transparent Decision Process**: Transforming the diagnostic process into a chain similar to a doctor's thought process, providing a transparent reasoning path.
- **Confidence Distribution**: Outputting the confidence distribution of diseases to ensure decision transparency and controlling the model's decisions through confidence thresholds.
- **Entropy Reduction**: Guiding effective symptom inquiry by reducing the entropy of confidence, thereby improving diagnostic accuracy.

Through these methods, CoD not only improves the diagnostic performance of the model but also enhances its interpretability and controllability. Experimental results show that DiagnosisGPT, developed based on CoD, performs excellently in multiple automatic diagnostic benchmarks, particularly in diagnostic accuracy and multi-round decision-making capabilities.
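
The confidence-threshold and entropy-reduction ideas described above can be illustrated with a small sketch. This is not the paper's implementation: the function names, the threshold value of 0.6, the toy disease names, and the `predicted` table of post-inquiry confidences are all hypothetical, chosen only to show how a confidence distribution could drive the "diagnose or keep asking" decision.

```python
import math

def entropy(confidences):
    """Shannon entropy (in nats) of a disease -> confidence mapping."""
    total = sum(confidences.values())
    probs = [c / total for c in confidences.values() if c > 0]
    return -sum(p * math.log(p) for p in probs)

def decide(confidences, threshold=0.6):
    """Return the top disease if its confidence reaches the threshold.

    Returns None when no disease is confident enough, meaning the agent
    should keep inquiring. The default threshold is an arbitrary
    placeholder, not a value taken from the paper.
    """
    disease, conf = max(confidences.items(), key=lambda kv: kv[1])
    return disease if conf >= threshold else None

def pick_symptom(confidences, predicted):
    """Pick the symptom whose inquiry is expected to reduce entropy most.

    `predicted` maps each candidate symptom to a hypothetical confidence
    distribution that would result from asking about it.
    """
    base = entropy(confidences)
    return max(predicted, key=lambda s: base - entropy(predicted[s]))

# Toy data (assumed for illustration only).
current = {"flu": 0.40, "cold": 0.35, "covid": 0.25}
predicted = {
    "fever": {"flu": 0.70, "cold": 0.10, "covid": 0.20},
    "sneezing": {"flu": 0.35, "cold": 0.40, "covid": 0.25},
}

next_symptom = pick_symptom(current, predicted)
```

With this toy data, `decide(current)` returns `None` because no disease reaches the threshold, and `pick_symptom` selects `"fever"` because its predicted distribution is the most concentrated. Raising the threshold makes the sketch more conservative (more inquiries before a diagnosis), which is the sense in which a confidence threshold makes diagnostic rigor controllable.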