Machine learning reveals novel targets for both glioblastoma and osteosarcoma

Nan Li, Max Ward, Muniba Bashir, Yunpeng Cao, Amitava Datta, Zhaoyu Li, Shuang Zhang
DOI: https://doi.org/10.1101/2024.11.05.622056
2024-11-08
Abstract: Glioblastoma and osteosarcoma originate from the same lineage, yet patients with these two tumour types show significant differences in survival outcomes. Transcriptomic analysis comparing these tumours reveals that over 65% of genes show similar expression patterns. Principal component analysis further demonstrates substantial similarities between the two tumour types, albeit with discernible differences. Deep learning analysis employing an autoencoder reveals finer distinctions and similarities between the two tumours at high resolution. A classification model built with eXtreme Gradient Boosting (XGBoost) achieves high accuracy in distinguishing between the two tumour types. SHapley Additive exPlanations (SHAP) identify the key contributors to model performance, yielding two lists of top target genes, one with and one without considering gender. Notably, these SHAP targets tend to cluster within one or two networks of signalling pathways. Remarkably, the gene expression levels of many of these SHAP targets alone can recapitulate the survival differences between glioblastoma and osteosarcoma patients that are observed from clinical data. Of particular interest, C2ORF72 emerges as a common target in both lists; it is an uncharacterised protein with promising potential as a novel diagnostic, prognostic, and therapeutic target for both glioblastoma and osteosarcoma.
Biology