Abstract: Aims: Early prevention and treatment of type 2 diabetes mellitus (T2DM) remain a major challenge for patients and clinicians. A recently proposed cluster-based diabetes classification may help to address this problem. In this study, we performed a cluster analysis of individuals newly diagnosed with T2DM, explored the clinical characteristics and medication treatment of each subtype, and compared the risk of diabetes complications and comorbidities among subtypes after adjusting for influencing factors. We hope to promote the further application of cluster analysis in individuals with early-stage T2DM. Methods: A k-means clustering algorithm was applied to five indicators, namely age, body mass index (BMI), glycosylated hemoglobin (HbA1c), homeostasis model assessment-2 insulin resistance (HOMA2-IR), and homeostasis model assessment-2 β-cell function (HOMA2-β), in 567 participants with newly diagnosed T2DM. The clinical characteristics and medication of each subtype were analyzed. The risk of diabetes complications and comorbidities in each subtype was compared by logistic regression analysis. Results: The 567 patients were clustered into four subtypes: severe insulin-deficient diabetes (SIDD, 24.46%), mild age-related diabetes (MARD, 30.86%), mild obesity-related diabetes (MOD, 25.57%), and severe insulin-resistant diabetes (SIRD, 20.11%). According to the results of the oral glucose tolerance test (OGTT) and biochemical indices, fasting blood glucose (FBG), 2-hour postprandial blood glucose (2hBG), HbA1c, total cholesterol (TC), low-density lipoprotein cholesterol (LDL-C), and triglyceride-glucose index (TyG) were higher in SIDD and SIRD than in MARD and MOD. 
MOD had the highest fasting C-peptide (FCP), 2-hour postprandial C-peptide (2hCP), fasting insulin (FINS), 2-hour postprandial insulin (2hINS), serum creatinine (SCr), and uric acid (UA), while SIRD had the highest triglycerides (TGs) and TyG-BMI. Alanine aminotransferase (ALT) and aspartate aminotransferase (AST) were higher in MOD and SIRD. Regarding medications, compared with the other subtypes, SIDD had a lower rate of metformin use (39.1%) and higher rates of α-glucosidase inhibitor (AGI, 61.7%) and insulin (74.4%) use. SIRD showed the highest frequency of use of sodium-glucose cotransporter-2 inhibitors (SGLT-2i, 36.0%) and glucagon-like peptide-1 receptor agonists (GLP-1RA, 19.3%). Concerning diabetic complications and comorbidities, the prevalence of diabetic kidney disease (DKD), cardiovascular disease (CVD), non-alcoholic fatty liver disease (NAFLD), dyslipidemia, and hypertension differed significantly among subtypes. In logistic regression analysis adjusted for unmodifiable (sex and age) and modifiable (e.g., BMI, HbA1c, and smoking) influences, SIRD had the highest risk of DKD (odds ratio, OR = 2.001, 95% confidence interval (CI): 1.125-3.559) and dyslipidemia (OR = 3.550, 95% CI: 1.534-8.215). MOD was more likely to have NAFLD (OR = 3.301, 95% CI: 1.586-6.870). Conclusions: Patients with newly diagnosed T2DM can be successfully clustered into four subtypes with different clinical characteristics, medication treatment, and risks for diabetes-related complications and comorbidities. The cluster-based diabetes classification may be beneficial both for prevention of secondary diabetes and for establishing a theoretical basis for precision medicine.
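
The clustering step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the software, the initialisation, or whether the indicators were standardised, so the z-score scaling, the random initialisation, and the helper names (`standardise`, `kmeans`, `make_synthetic`) are assumptions made here for illustration. The data are synthetic and the group centres are hypothetical, not values from the study; only the five indicators and k = 4 come from the abstract.

```python
import math
import random

# The five clustering indicators named in the abstract.
FEATURES = ["age", "bmi", "hba1c", "homa2_ir", "homa2_beta"]


def make_synthetic(n_per_group=50, seed=1):
    """Generate synthetic participants (hypothetical values, NOT study data)."""
    rng = random.Random(seed)
    # Hypothetical centres: age (y), BMI (kg/m^2), HbA1c (%), HOMA2-IR, HOMA2-beta.
    centres = [
        (50, 24, 11.0, 1.5, 30),
        (65, 23, 7.0, 1.2, 70),
        (45, 31, 7.5, 2.0, 90),
        (52, 28, 9.0, 4.5, 110),
    ]
    spreads = (6, 2, 0.8, 0.4, 12)
    return [
        [rng.gauss(m, s) for m, s in zip(centre, spreads)]
        for centre in centres
        for _ in range(n_per_group)
    ]


def standardise(rows):
    """Scale every column to mean 0 and SD 1 (an assumption, see lead-in)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in c) / len(c))
        for c, m in zip(cols, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means with random initial centroids drawn from the data."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


rows = make_synthetic()
scaled = standardise(rows)
labels, centroids = kmeans(scaled, k=4)
sizes = [labels.count(j) for j in range(4)]
print("cluster sizes:", sizes)
```

In practice a library implementation with multiple restarts would usually be preferred, because a single random initialisation of k-means can settle in a poor local optimum. Cluster numbers are arbitrary, so assigning names such as SIDD or SIRD requires inspecting each centroid's profile.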

Machine learning-based reproducible prediction of type 2 diabetes subtypes

Identifying subtypes of type 2 diabetes mellitus with machine learning: development, internal validation, prognostic validation and medication burden in linked electronic health records in 420 448 individuals

Prediction of Type 2 Diabetes Based on Machine Learning Algorithm

Validation of type 2 diabetes subgroups by simple clinical parameters: a retrospective cohort study of NHANES data from 1999 to 2014

Enhancing Type 2 Diabetes Treatment Decisions With Interpretable Machine Learning Models for Predicting Hemoglobin A1c Changes: Machine Learning Model Development

Prediction of type 1 diabetes using a genetic risk model in the Diabetes Autoimmunity Study in the Young.

1233-P: Prediction of Type 2 Diabetes Occurrence Using Machine Learning Model

Assessing reproducibility and utility of clustering of patients with type 2 diabetes and established CV disease (SAVOR-TIMI 53 trial)

Etiologies underlying subtypes of long-standing type 2 diabetes

Machine learning based predictive model of Type 2 diabetes complications using Malaysian National Diabetes Registry: A study protocol

An enhanced machine learning algorithm for type 2 diabetes prognosis with a detailed examination of Key correlates

Predicting Type 2 Diabetes Metabolic Phenotypes Using Continuous Glucose Monitoring and a Machine Learning Framework

Clinical application of cluster analysis in patients with newly diagnosed type 2 diabetes

Translating Subphenotypes of Newly Diagnosed Type 2 Diabetes from Cohort Studies to Electronic Health Records in the United States

Machine-learning to stratify diabetic patients using novel cardiac biomarkers and integrative genomics

Predicting diabetic retinopathy and identifying interpretable biomedical features using machine learning algorithms

Subtypes of Type 2 Diabetes Determined From Clinical Parameters

Effective questionnaire-based prediction models for type 2 diabetes across several ethnicities: a model development and validation study

Machine Learning Approach with Harmonized Multinational Datasets for Enhanced Prediction of Hypothyroidism in Patients with Type 2 Diabetes

1290-P: Developing Machine Learning Model for Predicting Acute Coronary Syndrome in Type 2 Diabetes Mellitus Patients through Substitution of Propensity Scores for Binary Variables

Identification of novel population clusters with different susceptibilities to type 2 diabetes and their impact on the prediction of diabetes