Predicting Tissue of Origin from Bulk Tumor Gene Expression using a Pre-trained Transformer Model

Ted Mellors, Mehran Spitmann
DOI: https://doi.org/10.1101/2024.12.01.626105
2024-12-05
Abstract: Identifying the tissue of origin for cancers is essential for enhancing diagnostic precision, selecting effective treatments, and guiding clinical decision-making. In this study, we developed a predictive model to classify the tissue of origin across various cancer types. Using a dataset with 10,300 samples from 32 unique tissue types, the model achieved an overall accuracy of 88% in distinguishing among all 32 classes, with an average accuracy of 99.2% within each class. When tested on metastatic skin tumors, it reached an accuracy of 87%, underscoring its potential in addressing challenging metastatic cases. These results demonstrate the model's reliability in oncology applications, offering a promising tool for improving diagnostic accuracy and supporting personalized cancer treatment strategies.
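The abstract reports two accuracy figures (88% overall, 99.2% averaged within each class) without saying how the second is computed. One plausible reading, offered here as an assumption and not as the authors' stated method, is that the per-class figure is a one-vs-rest accuracy: for each tissue type, a sample counts as correct whenever the label and the prediction agree on "is it this tissue or not". Under that reading, every misclassified sample is wrong for exactly two classes (its true tissue and its predicted tissue), so with K classes and overall accuracy a the mean one-vs-rest accuracy is 1 - 2(1 - a)/K. For a = 0.88 and K = 32 this gives 0.9925, close to the reported 99.2%. The sketch below checks that arithmetic on synthetic labels; the function names and the toy data are illustrative and do not come from the paper.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_one_vs_rest_accuracy(y_true, y_pred, classes):
    """Average over classes of the binary 'class c vs. everything else' accuracy.

    Assumption: this is one possible definition of "accuracy within each
    class"; the paper's abstract does not specify which definition it uses.
    """
    n = len(y_true)
    per_class = []
    for c in classes:
        # A sample is correct for class c when label and prediction agree
        # on membership in c (both "is c" or both "is not c").
        agree = sum((t == c) == (p == c) for t, p in zip(y_true, y_pred))
        per_class.append(agree / n)
    return sum(per_class) / len(per_class)


# Synthetic example: 32 classes, 800 samples, 12% of them misclassified.
classes = list(range(32))
y_true = [i % 32 for i in range(800)]
# Corrupt 3 of every 25 samples by shifting the label to the next class.
y_pred = [(t + 1) % 32 if i % 25 < 3 else t for i, t in enumerate(y_true)]

a = overall_accuracy(y_true, y_pred)
ovr = mean_one_vs_rest_accuracy(y_true, y_pred, classes)
closed_form = 1 - 2 * (1 - a) / len(classes)

print(f"overall accuracy:          {a:.4f}")
print(f"mean one-vs-rest accuracy: {ovr:.4f}")
print(f"closed form 1 - 2(1-a)/K:  {closed_form:.4f}")
```

On this toy data the overall accuracy is 0.88 and the mean one-vs-rest accuracy is 0.9925, matching the closed form. If the paper instead defines within-class accuracy as per-class recall, the two numbers would relate differently, so the interpretation above should be treated as a hypothesis.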
Biology