Deep Learning-Based Multimodal Clustering Model for Endotyping and Post-Arthroplasty Response Classification using Knee Osteoarthritis Subject-Matched Multi-Omic Data

Jason S. Rockel, Divya Sharma, Osvaldo Espin-Garcia, Katrina Hueniken, Amit Sandhu, Chiara Pastrello, Kala Sundararajan, Pratibha Potla, Noah Fine, Starlee S. Lively, Kimberly Perry, Nizar N Mahomed, Khalid Syed, Igor Jurisica, Anthony V. Perruccio, Y. Raja Rampersaud, Rajiv Gandhi, Mohit Kapoor
DOI: https://doi.org/10.1101/2024.06.13.24308857
2024-06-13
Abstract:
Background: Primary knee osteoarthritis (KOA) is a heterogeneous disease with clinical and molecular contributors. Biofluids contain microRNAs and metabolites that can be measured by omic technologies. Deep learning captures complex non-linear associations within multimodal data but, to date, has not been used for multi-omic-based endotyping of KOA patients. We developed a novel multimodal deep learning framework for clustering of multi-omic data from three subject-matched biofluids to identify distinct KOA endotypes and classify one-year post-total knee arthroplasty (TKA) pain/function responses.
Materials and Methods: In 414 KOA patients, subject-matched plasma, synovial fluid, and urine were analyzed by microRNA sequencing or metabolomics. To integrate 4 high-dimensional datasets comprising metabolites from plasma (n=151 features) and microRNAs from plasma (n=421), synovial fluid (n=930), and urine (n=1225), a multimodal deep learning variational autoencoder architecture with K-means clustering was employed. Features influencing cluster assignment were identified and pathway analyses were conducted. An integrative machine learning framework combining the 4 molecular domains and a clinical domain was then used to classify WOMAC pain/function responses post-TKA within each cluster.
Findings: Multimodal deep learning-based clustering of subjects across 4 domains yielded 3 distinct patient clusters. Feature signatures comprising microRNAs and metabolites across biofluids included 30, 16, and 24 features associated with Clusters 1-3, respectively. Pathway analyses revealed distinct pathways associated with each cluster. Integration of the 4 multi-omic domains along with clinical data improved response classification performance, with Cluster 3 achieving AUC=0.879 for subject pain response classification and Cluster 2 reaching AUC=0.808 for subject function response, surpassing individual domain classifications by 12% and 15%, respectively.
Interpretation: We have developed a deep learning-based multimodal clustering model capable of integrating complex multi-fluid, multi-omic data to assist in KOA patient endotyping and test outcome response to TKA surgery.
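The clustering step described in Materials and Methods (embed each omic domain, combine the embeddings, then run K-means) can be illustrated with a minimal sketch. This is not the authors' model: the variational autoencoder is replaced here by a per-domain linear PCA projection purely as a stand-in, the data are random numbers, and the names `synthetic_domains`, `pca_embed`, `cluster_subjects`, and the latent size of 8 are hypothetical choices made for illustration. Only the subject count (414), the per-domain feature counts (151, 421, 930, 1225), and the number of clusters (3) come from the abstract.

```python
import numpy as np

# Feature counts per domain, taken from the abstract: plasma metabolites,
# plasma miRNA, synovial fluid miRNA, urine miRNA.
FEATURE_COUNTS = (151, 421, 930, 1225)


def synthetic_domains(n_subjects=414, seed=0):
    """Hypothetical helper: random matrices shaped like the four domains."""
    rng = np.random.default_rng(seed)
    return [rng.normal(size=(n_subjects, p)) for p in FEATURE_COUNTS]


def standardize(x):
    """Z-score each column; constant columns are mapped to zero."""
    mu = x.mean(axis=0)
    sd = x.std(axis=0)
    sd[sd == 0] = 1.0
    return (x - mu) / sd


def pca_embed(x, n_components):
    """Linear stand-in for a VAE encoder (assumption, not the paper's method):
    project centered data onto its top principal components."""
    xc = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return xc @ vt[:n_components].T


def assign(z, centroids):
    """Index of the nearest centroid (squared Euclidean) for each row of z."""
    dists = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def kmeans(z, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centroids."""
    rng = np.random.default_rng(seed)
    centroids = z[rng.choice(len(z), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(z, centroids)
        # Recompute each centroid; an empty cluster keeps its old centroid.
        updated = np.array([
            z[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(z, centroids), centroids


def cluster_subjects(domains, n_latent=8, k=3, seed=0):
    """Embed each domain separately, concatenate per subject, then cluster."""
    parts = [standardize(pca_embed(standardize(x), n_latent)) for x in domains]
    z = np.hstack(parts)  # shape: (n_subjects, n_latent * number of domains)
    return kmeans(z, k, seed=seed)


if __name__ == "__main__":
    labels, _ = cluster_subjects(synthetic_domains())
    print("subjects per cluster:", np.bincount(labels, minlength=3))
```

Embedding each domain separately before concatenation keeps the largest domain (urine microRNAs, 1225 features) from dominating the distance computation, which is one plausible motivation for a multimodal encoder; a real VAE would additionally learn non-linear structure that PCA cannot capture.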
Orthopedics