A Medical Multimodal Large Language Model for Pediatric Pneumonia

Weiwei Tian, Xinyu Huang, Tianhao Cheng, Wen He, Jinwu Fang, Rui Feng, Daoying Geng, Xiaobo Zhang
2024-09-04
Abstract: Pediatric pneumonia is the leading cause of death among children under five years worldwide, imposing a substantial burden on affected families. Currently, there are three significant hurdles in diagnosing and treating pediatric pneumonia. Firstly, pediatric pneumonia shares similar symptoms with other respiratory diseases, making rapid and accurate differential diagnosis challenging. Secondly, primary hospitals often lack sufficient medical resources and experienced doctors. Lastly, providing personalized diagnostic reports and treatment recommendations is labor-intensive and time-consuming. To tackle these challenges, we propose a Medical Multimodal Large Language Model for Pediatric Pneumonia (P2Med-MLLM). It can handle diverse clinical tasks, such as generating free-text radiology reports and medical records, within a unified framework. Specifically, P2Med-MLLM can process both pure text and image-text data, and is trained on a large-scale dataset (P2Med-MD) that includes real clinical information from 163,999 outpatient and 8,684 inpatient cases. This dataset comprises 2D chest X-ray images, 3D chest CT images, corresponding radiology reports, and outpatient and inpatient records. We design a three-stage training strategy to enable P2Med-MLLM to comprehend medical knowledge and follow instructions for various clinical tasks. To rigorously evaluate P2Med-MLLM's performance, we develop P2Med-MBench, a benchmark consisting of 642 samples meticulously verified by pediatric pulmonology specialists, covering six clinical decision-support tasks and a balanced variety of diseases. The automated scoring results demonstrate the superiority of P2Med-MLLM. This work plays a crucial role in assisting primary care doctors with prompt disease diagnosis and treatment planning, reducing mortality rates among patients with severe symptoms, and optimizing the allocation of medical resources.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### The Problem the Paper Aims to Solve

The paper aims to address three major challenges in the diagnosis and treatment of pediatric pneumonia:

1. **Symptom Similarity**: The symptoms of pediatric pneumonia are similar to those of other respiratory diseases (such as bronchitis and asthma), making rapid and accurate differential diagnosis very difficult.
2. **Insufficient Medical Resources**: Primary hospitals often lack sufficient medical resources and experienced doctors, leading to misdiagnosis and inappropriate treatment.
3. **Time-consuming Personalized Reports**: Generating personalized diagnostic reports and treatment recommendations requires a lot of manpower and time.

To address these challenges, the research team proposed a multimodal large language model specifically for pediatric pneumonia (P2Med-MLLM). This model can handle various clinical tasks, such as generating free-text radiology reports and medical records, and can process both pure text and image-text data within a unified framework. P2Med-MLLM was trained on a large-scale dataset (P2Med-MD) that contains real clinical information from 163,999 outpatient cases and 8,684 inpatient cases, comprising 2D chest X-rays, 3D chest CT images, their corresponding radiology reports, and outpatient and inpatient records.

Through a three-stage training strategy, P2Med-MLLM learns to understand medical knowledge and follow instructions for various clinical tasks. To evaluate its performance, the research team also developed a benchmark test set (P2Med-MBench), covering six clinical decision-support tasks and a balanced sample of various diseases. Experimental results show that P2Med-MLLM outperforms other open-source large language models in multiple tasks.