Mining incomplete clinical data for the early assessment of Kawasaki disease based on feature clustering and convolutional neural networks

Haolin Wang, Xuhai Tan, Zhilin Huang, Bo Pan, Jie Tian
DOI: https://doi.org/10.1016/j.artmed.2020.101859
Abstract: Kawasaki disease (KD) is the leading cause of acquired heart disease in children. Its prompt treatment can effectively lower the risk of severe complications, such as coronary aneurysms. However, accurately diagnosing KD at its early stage is impracticable given its unknown pathogenesis and lack of pathognomonic features. In this study, we investigated data-driven approaches by using a cohort of 10,367 patients extracted from electronic health records for early KD assessment. The incompleteness of clinical data presents group-based missing patterns associated with different clinical assessment measures. To address this problem, we developed a method integrating feature clustering to enable matrix-based representation and convolutional neural networks (CNN) for feature extraction and fusion to explicitly exploit the multi-source data structure. Integrating missing data imputation methods with the proposed method demonstrated superior accuracy (an AUC of 0.97) compared with a number of benchmark methods. The present method shows potential to improve clinical data mining. Our study highlighted the feasible utilization of matrix-based feature representation and CNN-based feature extraction for incomplete clinical data mining to support medical decision-making.
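To make the idea of a matrix-based representation concrete, the following is a minimal sketch and not the authors' implementation. It assumes that features are grouped when their missing-value masks are exactly identical (a simple stand-in for the feature clustering named in the abstract), and that missing entries are filled with zero (a stand-in for the imputation step). The function names, the toy array, and the zero-padding scheme are all hypothetical.

```python
import numpy as np

def cluster_features_by_missingness(X):
    """Group column indices whose NaN masks are identical across samples.

    Assumption: exact mask equality stands in for the paper's feature clustering.
    """
    mask = np.isnan(X)
    groups = {}
    for j in range(X.shape[1]):
        key = mask[:, j].tobytes()  # the column's missingness pattern as a hashable key
        groups.setdefault(key, []).append(j)
    return list(groups.values())

def to_matrix_representation(X, groups, fill_value=0.0):
    """Arrange each sample as a (n_groups, max_group_size) matrix, one row per group.

    Short groups are right-padded, and NaNs are replaced, with fill_value
    (a placeholder for a real imputation method).
    """
    width = max(len(g) for g in groups)
    out = np.full((X.shape[0], len(groups), width), fill_value, dtype=float)
    for row, cols in enumerate(groups):
        out[:, row, :len(cols)] = X[:, cols]
    return np.nan_to_num(out, nan=fill_value)

# Toy data: 3 patients, 5 features; columns 2-3 and column 4 come from
# different hypothetical assessments, each missing for different patients.
X = np.array([
    [1.0, 2.0, np.nan, np.nan, 5.0],
    [1.5, 2.5, 3.0,    4.0,    np.nan],
    [0.5, 1.0, np.nan, np.nan, 6.0],
])
groups = cluster_features_by_missingness(X)
M = to_matrix_representation(X, groups)
print(groups)
print(M.shape)
```

Each patient's 2-D matrix in `M` could then be passed to a convolutional network as a single-channel input, so that features from the same assessment group sit in the same row.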