Individualised prediction of chronic kidney disease for the elderly in longevity areas in China: machine learning approaches (Preprint)

Dai Su,Xingyu Zhang,Kevin He,Yingchun Chen
DOI: https://doi.org/10.2196/preprints.24674
2020-09-30
Abstract:
BACKGROUND Chronic kidney disease (CKD) has become a major public health problem worldwide and imposes a heavy social and economic burden, especially in developing countries. No previous study has combined machine learning (ML) methods with longitudinal data to predict the two-year risk of developing CKD amongst the elderly in China.
OBJECTIVE To predict CKD amongst the elderly in longevity areas in China by using five ML models.
METHODS This study was based on the panel data of 925 elderly individuals in the 2012 baseline survey and 2014 follow-up survey of the HABCS database. Six ML models were developed to predict the probability of CKD amongst the elderly in two years. The receiver operating characteristic (ROC) curve and decision curve analysis were used to evaluate the prediction accuracy of the reference and ML models.
RESULTS Amongst the 925 elderly in the HABCS 2014 survey, 289 (18.8%) had CKD. The predictive performance of logistic regression (LR), lasso regression, random forest (RF), gradient boosted decision tree (GBDT) and deep neural network (DNN) did not differ significantly (all AUC > 0.7), whereas the support vector machine (SVM) exhibited the lowest predictive performance (AUC = 0.633, p-value = 0.057). DNN had the highest positive predictive value (PPV, 0.328), whereas LR had the lowest (0.287). Decision curve analysis indicated that within the threshold ranges of approximately 0–0.03 and 0.37–0.40, the net benefit of GBDT was the largest. Within the threshold ranges of approximately 0.03–0.10 and 0.26–0.30, the net benefit of RF was the largest. Age was the most important predictor variable in the RF and GBDT models. Blood urea nitrogen, serum albumin, uric acid, BMI, marital status, activities of daily living (ADL) and instrumental activities of daily living (IADL), and gender were crucial in predicting CKD in the elderly.
CONCLUSIONS The ML models could successfully capture the linear and nonlinear relationships of risk factors for CKD in the elderly. A decision support system based on the predictive models in this research can help medical staff detect health problems in the elderly and intervene early.
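The abstract compares models by their net benefit across threshold probabilities, the quantity plotted in decision curve analysis. As a minimal sketch of how that quantity is usually computed, assuming the standard definition (true-positive rate minus false-positive rate weighted by the odds of the threshold), the following Python function may help. The function name `net_benefit` and the toy labels and probabilities are hypothetical and are not taken from the study.

```python
import numpy as np


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of flagging everyone whose predicted risk is >= threshold.

    Assumes the standard decision-curve definition:
        NB = TP / n - (FP / n) * threshold / (1 - threshold)
    """
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob)
    n = y_true.size
    flagged = y_prob >= threshold
    tp = np.sum(flagged & (y_true == 1))  # flagged and truly diseased
    fp = np.sum(flagged & (y_true == 0))  # flagged but not diseased
    # False positives are penalised by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Toy data invented for illustration only (not from the HABCS survey).
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.8, 0.1, 0.3, 0.6, 0.2, 0.05, 0.4, 0.35, 0.15, 0.5]

# Evaluate the hypothetical model over a few threshold probabilities.
for t in (0.1, 0.3, 0.4):
    print(f"threshold={t:.2f}  net benefit={net_benefit(y_true, y_prob, t):.3f}")
```

Plotting this value against the threshold for each model, alongside the "treat all" and "treat none" strategies, gives a decision curve; the model with the highest curve in a given threshold range is the preferred one there, which is how statements such as "the net benefit of GBDT was the largest" are read.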