Enabling Patient-side Disease Prediction via the Integration of Patient Narratives

Zhixiang Su, Yinan Zhang, Jiazheng Jing, Jie Xiao, Zhiqi Shen
2024-05-05
Abstract: Disease prediction holds considerable significance in modern healthcare because of its crucial role in facilitating early intervention and implementing effective prevention measures. However, most recent disease prediction approaches rely heavily on laboratory test outcomes (e.g., blood tests and medical imaging such as X-rays). Gaining access to such data is often complex from a patient's standpoint, and the data is available only after a consultation. To make disease prediction available on the patient side, we propose Personalized Medical Disease Prediction (PoMP), which predicts diseases using patient health narratives, including textual descriptions and demographic information. With PoMP, patients can gain a clearer understanding of their conditions, enabling them to seek appropriate medical specialists directly and reducing the time spent navigating healthcare communication to locate suitable doctors. We conducted extensive experiments using real-world data from Haodf to demonstrate the effectiveness of PoMP.
Computation and Language
What problem does this paper attempt to address?
The paper addresses a limitation of current disease prediction methods: they rely heavily on laboratory test results (such as blood tests and medical imaging), which are usually available only after the patient has consulted a doctor. To remove this dependency, the study proposes a personalized medical disease prediction model (PoMP) that predicts diseases solely from the patient's health narrative, that is, a textual description plus demographic information. Patients can therefore get an early indication of their possible condition without professional medical data and go directly to a suitable specialist, which reduces the time and effort spent finding the right department and doctor.

PoMP uses a two-layer classification architecture: it first predicts the major disease category and then refines that prediction to a specific disease, which improves prediction accuracy. Experimental results show that PoMP outperforms existing pre-trained language models on multiple evaluation metrics. The research team also collected a dataset named Haodf to validate the model and made the dataset and source code publicly available to support future research. Overall, the work shows that patient narratives alone can serve as a basis for disease prediction.
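The two-stage idea described above (predict the major category first, then refine to a specific disease) can be illustrated with a minimal sketch. This is not the authors' implementation: the taxonomy, the keywords, and the `predict` function are hypothetical, and a simple keyword count stands in for the learned text model. It also ignores demographic information, which PoMP uses.

```python
import re
from collections import Counter

# Hypothetical toy taxonomy (an assumption for illustration only):
# major category -> specific disease -> indicative keywords.
TAXONOMY = {
    "respiratory": {
        "asthma": ["wheezing", "breath", "inhaler"],
        "pneumonia": ["fever", "cough", "phlegm"],
    },
    "digestive": {
        "gastritis": ["stomach", "nausea", "burning"],
        "appendicitis": ["abdomen", "sharp", "right"],
    },
}


def keyword_score(counts, keywords):
    """Count how often any of the keywords occurs in the narrative."""
    return sum(counts[word] for word in keywords)


def predict(narrative):
    """Return (category, disease) using coarse-to-fine classification."""
    counts = Counter(re.findall(r"[a-z]+", narrative.lower()))

    # Stage 1: choose the major category by pooling the keywords
    # of every disease that belongs to it.
    def category_score(category):
        pooled = [w for kws in TAXONOMY[category].values() for w in kws]
        return keyword_score(counts, pooled)

    category = max(TAXONOMY, key=category_score)

    # Stage 2: choose a specific disease, but only among the
    # diseases inside the category selected in stage 1.
    diseases = TAXONOMY[category]
    disease = max(diseases, key=lambda d: keyword_score(counts, diseases[d]))
    return category, disease


if __name__ == "__main__":
    print(predict("I have a fever and a bad cough with phlegm."))
```

The point of the design is that the second stage only has to discriminate among the diseases of one category, so the fine-grained decision is made over a much smaller label set than a single flat classifier would face.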