Comparison of the effects of imputation methods for missing data in predictive modelling of cohort study datasets

JiaHang Li, ShuXia Guo, RuLin Ma, Jia He, XiangHui Zhang, DongSheng Rui, YuSong Ding, Yu Li, LeYao Jian, Jing Cheng, Heng Guo
DOI: https://doi.org/10.1186/s12874-024-02173-x
2024-02-17
BMC Medical Research Methodology
Abstract: Missing data is often unavoidable in cohort studies and can adversely affect a study's findings. We assess the effectiveness of eight commonly used statistical and machine learning (ML) imputation methods for handling missing data in predictive modelling of cohort study datasets. The evaluation is based on real data and predictive models for cardiovascular disease (CVD) risk.
health care sciences & services
What problem does this paper attempt to address?