DHDIP: An interpretable model for hypertension and hyperlipidemia prediction based on EMR data

Bin Liao, Xiaoyao Jia, Tao Zhang, Ruina Sun
DOI: https://doi.org/10.1016/j.cmpb.2022.107088
IF: 6.1
2022-11-01
Computer Methods and Programs in Biomedicine
Abstract:
Background and objective: Traditional hypertension and hyperlipidemia prediction models suffer from inconsistent modeling data sources, small sample sizes, and the lack of a uniform standard for the index system, so they fail to meet the needs of clinical application. To address this issue, this work proposes DHDIP, an interpretable hypertension and hyperlipidemia prediction model based on EMR data.
Methods: First, massive, high-dimensional, unstructured EMR data are selected as a unified modeling data source, and a pre-processing algorithm is proposed to solve the problem that EMR data cannot be processed directly by machine learning algorithms. Second, a variety of mainstream models, such as XGBoost, CatBoost, and RandomForest, are used for modeling, and the best-suited algorithms are identified by performance comparison. Finally, the SHAP framework is introduced into the DHDIP model to identify the main factors contributing to hypertension and hyperlipidemia, which enhances the interpretability of the model.
Results: The DHDIP model's MSE value is 0.0285, and its LOSS value is 0.0054, both of which are better than those reported in previous studies.
Conclusion: The model balances performance and interpretability. Multi-objective learning allows for a more thorough analysis and prediction of the condition, which not only lowers the cost of disease prediction but also aids physicians in clinical diagnosis. The datasets and source code are available at https://github.com/Xiaoyao-Jia/DHDIP
engineering, biomedical; computer science, interdisciplinary applications; medical informatics; theory & methods
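The interpretability step named in the abstract relies on the SHAP framework, which attributes a prediction to individual input features using Shapley values. As a rough illustration of that idea only, the sketch below computes exact Shapley values by enumerating feature coalitions. It is not the authors' code and does not use the SHAP library. The `toy_risk` function, the feature names, the `patient` and `reference` values, and the baseline-replacement scheme are all hypothetical assumptions chosen for the example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to baseline.

    Features outside a coalition are replaced by their baseline value
    (an assumption of this sketch). The cost is exponential in the
    number of features, so this is only suitable for tiny inputs.
    """
    n = len(x)

    def value(coalition):
        # Evaluate the model with only the coalition's features "switched on".
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(features):
    """Hypothetical risk score with one interaction term (not from the paper)."""
    age, bmi, smoker = features
    return 0.01 * age + 0.02 * bmi + 0.1 * smoker + 0.005 * bmi * smoker


patient = [60.0, 30.0, 1.0]    # hypothetical record: age, BMI, smoker flag
reference = [40.0, 22.0, 0.0]  # hypothetical baseline record
phi = shapley_values(toy_risk, patient, reference)

for name, contribution in zip(["age", "bmi", "smoker"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the prediction for `patient` and the prediction for `reference`, which is the property that makes such attributions readable as a per-feature breakdown. For tree ensembles such as XGBoost or CatBoost, the SHAP library provides much faster algorithms than this brute-force enumeration.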