Clustering‐based risk stratification of prediabetes populations: Insights from the Taiwan and UK Biobanks

Djeane Debora Onthoni, Ying‐Erh Chen, Yi‐Hsuan Lai, Guo‐Hung Li, Yong‐Sheng Zhuang, Hong‐Ming Lin, Yu‐Ping Hsiao, Ade Indra Onthoni, Hung‐Yi Chiou, Ren‐Hua Chung
DOI: https://doi.org/10.1111/jdi.14328
2024-10-12
Journal of Diabetes Investigation
Abstract: This research highlights how unsupervised learning effectively pinpoints risk‐specific groups among prediabetes populations using the Taiwan Biobank (TWB) and UK Biobank (UKB), revealing strong correlations with lifestyle factors and heightened susceptibility to diabetes complications. It underscores the necessity for personalized preventive measures, particularly targeting younger males with prediabetes and higher body mass index (BMI) and triglyceride levels in the TWB, where smoking cessation could notably lower the risk of developing type 2 diabetes (T2D).
Aims/Introduction: This study aimed to identify low‐ and high‐risk diabetes groups within prediabetes populations using data from the Taiwan Biobank (TWB) and UK Biobank (UKB) through a clustering‐based Unsupervised Learning (UL) approach, to inform targeted type 2 diabetes (T2D) interventions.
Materials and Methods: Data from TWB and UKB, comprising clinical and genetic information, were analyzed. Prediabetes was defined by glucose thresholds, and incident T2D was identified through follow‐up data. K‐means clustering was performed on prediabetes participants using significant features determined through logistic regression and LASSO. Cluster stability was assessed using mean Jaccard similarity, silhouette score, and the elbow method.
Results: We identified two stable clusters representing high‐ and low‐risk diabetes groups in both biobanks. The high‐risk clusters showed higher diabetes incidence, with 15.7% in TWB and 13.0% in UKB, compared to 7.3% and 9.1% in the low‐risk clusters, respectively. Notably, males were predominant in the high‐risk groups, constituting 76.6% in TWB and 52.7% in UKB. In TWB, the high‐risk group also exhibited significantly higher BMI, fasting glucose, and triglycerides, while UKB showed marginal significance in BMI and other metabolic indicators. Current smoking was significantly associated with increased diabetes risk in the TWB high‐risk group (P
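As an illustration of the stability check named in the Materials and Methods, the sketch below runs K‐means on synthetic two‐dimensional points and scores each cluster by its mean Jaccard similarity across bootstrap resamples. It is a minimal sketch, not the authors' pipeline. The function names (`kmeans`, `jaccard`, `mean_jaccard_stability`), the synthetic data, the farthest‐point initialisation, and the use of distinct bootstrap indices are all assumptions made for this example. Feature selection by logistic regression and LASSO is not shown.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns one cluster label per point.

    Assumption: deterministic farthest-point initialisation, chosen here
    only so that the example is reproducible.
    """
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def jaccard(a, b):
    """Jaccard similarity of two sets of point indices."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def mean_jaccard_stability(points, k, n_boot=20, seed=0):
    """Mean best-match Jaccard similarity per original cluster over resamples."""
    rng = random.Random(seed)
    n = len(points)
    base = kmeans(points, k)
    base_clusters = [{i for i in range(n) if base[i] == c} for c in range(k)]
    totals = [0.0] * k
    for _ in range(n_boot):
        # Simplification: keep only the distinct indices drawn with replacement.
        idx = sorted({rng.randrange(n) for _ in range(n)})
        boot = kmeans([points[i] for i in idx], k)
        boot_clusters = [{i for i, lab in zip(idx, boot) if lab == c} for c in range(k)]
        sampled = set(idx)
        for c in range(k):
            # Compare on resampled points only; keep the best-matching cluster.
            ref = base_clusters[c] & sampled
            totals[c] += max(jaccard(ref, bc) for bc in boot_clusters)
    return [t / n_boot for t in totals]

# Synthetic data (not biobank data): two well-separated groups of 30 points.
rng = random.Random(1)
points = ([(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(30)]
          + [(rng.uniform(9, 11), rng.uniform(9, 11)) for _ in range(30)])
labels = kmeans(points, k=2)
stability = mean_jaccard_stability(points, k=2)
print(stability)
```

Values near 1 indicate that a cluster is recovered almost unchanged in each resample, while lower values indicate a cluster that dissolves or merges when the data are perturbed. On real data this check would normally be read alongside the silhouette score and the elbow plot rather than on its own.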
endocrinology & metabolism