Joint Clinical And Molecular Subtyping Of COPD With Variational Autoencoders

Enrico Maiorino, Margherita De Marzio, Scott T. Weiss, Edwin K. Silverman, Peter J. Castaldi, Kimberly Glass
DOI: https://doi.org/10.1101/2023.08.19.23294298
2023-08-21
MedRxiv
Abstract: Chronic Obstructive Pulmonary Disease (COPD) is a complex, heterogeneous disease. Traditional subtyping methods generally focus on either the clinical manifestations or the molecular endotypes of the disease, resulting in domain-specific classifications that may not capture its full complexity. Here, we introduce an approach based on variational autoencoders that integrates clinical and blood gene expression data from the COPDGene cohort study. We generate Personalized Integrated Profiles (PIPs) that recapitulate the joint clinical and molecular state of each individual in the population. Through prediction experiments, we show that the PIPs encode the complex disease state of each individual in a compact representation, with accuracy comparable to or better than that of other embedding approaches. Using these profiles, we study the space of continuous variation of COPD features with graph-based trajectory learning techniques and delineate five well-separated subtypes. The identified subtypes exhibit distinct phenotypes, expression signatures, and disease outcomes. Overall, our findings show that integrating clinical and molecular data is beneficial for gaining a more comprehensive understanding of COPD heterogeneity.