PRISM: Mitigating EHR Data Sparsity Via Learning from Missing Feature Calibrated Prototype Patient Representations

Yinghao Zhu, Zixiang Wang, Long He, Shiyun Xie, Xiaochen Zheng, Liantao Ma, Chengwei Pan
DOI: https://doi.org/10.1145/3627673.3679521
2024-01-01
Abstract: Electronic Health Records (EHRs) provide valuable patient data but often suffer from sparsity, which poses significant challenges for predictive modeling. Conventional imputation methods do not adequately distinguish between real and imputed data, which can lead to inaccurate patient representations. To address these issues, we introduce PRISM, a framework that imputes data indirectly through prototype representations of similar patients, yielding denser and more accurate embeddings. PRISM also includes a feature confidence learner module, which evaluates the reliability of each feature given its missing status. Additionally, it incorporates a new patient similarity metric that accounts for feature confidence, avoiding overreliance on imprecise imputed values. Our extensive experiments on the MIMIC-III, MIMIC-IV, PhysioNet Challenge 2012, and eICU datasets demonstrate PRISM's superior performance on in-hospital mortality and 30-day readmission prediction tasks, showcasing its effectiveness in handling EHR data sparsity. For the sake of reproducibility and further research, we have publicly released the code at https://github.com/yhzhu99/PRISM.
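To make the idea of a confidence-aware similarity metric concrete, the following is a minimal illustrative sketch and not PRISM's actual formulation, which the abstract does not specify. It assumes a weighted cosine similarity in which each feature's contribution is scaled by the product of the two patients' confidences, and it assumes a fixed low confidence for imputed values, whereas PRISM learns feature confidence. The names `confidence_from_mask`, `confidence_weighted_similarity`, and `imputed_confidence` are hypothetical.

```python
import math


def confidence_from_mask(observed, imputed_confidence=0.2):
    """Hypothetical stand-in for a learned confidence: 1.0 for observed
    features, a fixed lower value (an assumption) for imputed ones."""
    return [1.0 if is_observed else imputed_confidence for is_observed in observed]


def confidence_weighted_similarity(x, y, conf_x, conf_y):
    """Cosine similarity under a weighted inner product, where the weight of
    each feature is the product of both patients' confidences in it."""
    weights = [cx * cy for cx, cy in zip(conf_x, conf_y)]
    dot = sum(w * a * b for w, a, b in zip(weights, x, y))
    norm_x = math.sqrt(sum(w * a * a for w, a in zip(weights, x)))
    norm_y = math.sqrt(sum(w * b * b for w, b in zip(weights, y)))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # no reliable shared evidence to compare on
    return dot / (norm_x * norm_y)


# Two patients agree on the observed features; the third value of
# patient_b is an imputed placeholder (made-up numbers for illustration).
patient_a = [0.5, 1.2, -0.3]
patient_b = [0.6, 1.0, 2.0]
conf_a = confidence_from_mask([True, True, True])
conf_b = confidence_from_mask([True, True, False])

full_trust = [1.0, 1.0, 1.0]
unweighted = confidence_weighted_similarity(patient_a, patient_b, full_trust, full_trust)
weighted = confidence_weighted_similarity(patient_a, patient_b, conf_a, conf_b)
print(f"unweighted={unweighted:.3f} weighted={weighted:.3f}")
```

In this toy case the imputed third feature pulls the unweighted similarity down, while down-weighting it lets the agreement on observed features dominate, which is the kind of overreliance on imputed values the abstract says the metric is meant to avoid.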
What problem does this paper attempt to address?