Abstract: Background: Type 1 diabetes (T1D) is a devastating autoimmune disease, and its rising prevalence in the United States and around the world presents a critical public health problem. While some treatment options exist for patients who have already been diagnosed, individuals who are considered at risk of developing T1D and are still in the early, asymptomatic stages of disease pathogenesis have no options for preventive intervention. This is because of the uncertainty in determining their risk level and in predicting with high confidence who will, or will not, progress to clinical diagnosis. Biomarkers that assess an individual's risk with high certainty could address this problem and inform decisions on early intervention, especially in children, where the burden of justifying treatment is high. Single-omics approaches (e.g., genomics, proteomics, and metabolomics) have been applied to identify T1D biomarkers based on specific disturbances associated with the disease. However, reliable early biomarkers of T1D have remained elusive to date. To overcome this, we previously showed that parallel multi-omics provides a more comprehensive picture of the disease-associated disturbances and facilitates the identification of candidate T1D biomarkers. Methods: This paper evaluated machine learning (ML), combining data augmentation with supervised ML methods, to improve the identification of salient patterns in the data and, ultimately, the extraction of novel biomarker candidates from integrated parallel multi-omics datasets derived from a limited number of samples. We also examined different stages of data integration (early, intermediate, and late) to assess at which stage supervised parametric models can learn under conditions of high dimensionality and variation in feature counts across the different omics. In the late integration scheme, we employed a multi-view ensemble comprising individual parametric models, each trained on a single omics dataset, to address the computational challenges posed by the high dimensionality and the variation in feature counts across the different yet integrated multi-omics datasets. Results: The multi-view ensemble improved the prediction of case vs. control and was the most successful at flagging a larger, consistent set of associated features when compared with chance models; these features may eventually be used downstream to identify a novel composite biomarker signature of T1D risk. Conclusions: The current work demonstrates the utility of supervised ML in exploring integrated parallel multi-omics data in the ongoing quest for early T1D biomarkers. It reinforces the hope of identifying novel composite biomarker signatures of T1D risk via ML and, ultimately, of informing early treatment decisions in the face of the escalating global incidence of this debilitating disease.
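The late integration scheme described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the parametric model, so L2-regularised logistic regression stands in for it; the view names, feature counts, sample size, and synthetic data are hypothetical; and soft voting (averaging per-view probabilities) is only one of several ways the ensemble could combine its members. Each view gets its own model, so the views never need to share a feature space or a feature count.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200, l2=0.01):
    """Fit L2-regularised logistic regression by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, X):
    """Return the predicted probability of the 'case' class for each sample."""
    w, b = model
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def late_integration_proba(models, views):
    """Soft voting: average the per-view case probabilities sample by sample."""
    per_view = [predict_proba(m, X) for m, X in zip(models, views)]
    n_samples = len(per_view[0])
    return [sum(p[i] for p in per_view) / len(per_view) for i in range(n_samples)]


def make_view(y, n_features, shift, rng):
    # Synthetic omics block (assumption): only the first feature carries a
    # class-dependent shift; all remaining features are pure noise.
    return [
        [rng.gauss(shift * yi if j == 0 else 0.0, 1.0) for j in range(n_features)]
        for yi in y
    ]


rng = random.Random(0)
y = [i % 2 for i in range(40)]  # 1 = case, 0 = control (hypothetical labels)

# Hypothetical views with deliberately unequal feature counts.
views = {
    "proteomics": make_view(y, 5, 1.5, rng),
    "metabolomics": make_view(y, 20, 1.0, rng),
    "lipidomics": make_view(y, 50, 0.5, rng),
}

# One parametric model per view, trained independently of the others.
models = {name: fit_logistic(X, y) for name, X in views.items()}

ensemble = late_integration_proba(list(models.values()), list(views.values()))
predictions = [int(p >= 0.5) for p in ensemble]
train_accuracy = sum(int(p == t) for p, t in zip(predictions, y)) / len(y)
print(f"training accuracy of the late-integration ensemble: {train_accuracy:.2f}")
```

Early integration would instead concatenate the three blocks into a single 75-feature matrix before fitting one model, which is where the imbalance in feature counts across omics becomes a concern. The accuracy printed here is measured on the training data and says nothing about generalisation; with a limited number of samples, a held-out or cross-validated estimate would be needed.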

Supervised Parametric Learning in the Identification of Composite Biomarker Signatures of Type 1 Diabetes in Integrated Parallel Multi-Omics Datasets

Parallel Multi-Omics in High-Risk Subjects for the Identification of Integrated Biomarker Signatures of Type 1 Diabetes

Predicting type 2 diabetes via machine learning integration of multiple omics from human pancreatic islets

Metabolomics and Lipidomics Studies in Pediatric Type 1 Diabetes: Biomarker Discovery for the Early Diagnosis and Prognosis

Machine-learning to stratify diabetic patients using novel cardiac biomarkers and integrative genomics

Machine learning approach reveals microbiome, metabolome, and lipidome profiles in type 1 diabetes

AI-driven Integration of Multimodal Imaging Pixel Data and Genome-wide Genotype Data Enhances Precision Health for Type 2 Diabetes: Insights from a Large-scale Biobank Study

Pilot-Study to Explore Metabolic Signature of Type 2 Diabetes: A Pipeline of Tree-Based Machine Learning and Bioinformatics Techniques for Biomarkers Discovery

Machine Learning Reveals Metabolic Signatures in Patients with Type 1 Diabetes

185-OR: Machine-Learning Approach Reveals Microbiome, Metabolome, Lipidome, and Their Interaction in Type 1 Diabetes Mellitus

Simultaneous Modeling of Multiple Complications for Risk Profiling in Diabetes Care

Serological Phenotyping Analysis Uncovers a Unique Metabolomic Pattern Associated With Early Onset of Type 2 Diabetes Mellitus

Machine Learning as a Support for the Diagnosis of Type 2 Diabetes

Clinical, genomic, and proteomic perspectives in the analysis of comorbid conditions in type 2 diabetes mellitus: a retrospective study

Exploring new frontiers in type 1 diabetes through advanced mass-spectrometry-based molecular measurements

Genetic association and machine learning improves discovery and prediction of type 1 diabetes

Integration of metabolomics and transcriptomics data to aid biomarker discovery in type 2 diabetes

Fast Bayesian Integrative Learning of Multiple Gene Regulatory Networks for Type 1 Diabetes

Stratification of diabetes in the context of comorbidities, using representation learning and topological data analysis

A Three-gene-based Type 1 Diabetes Diagnostic Signature

Leveraging Gene Expression Data and Explainable Machine Learning for Enhanced Early Detection of Type 2 Diabetes