Abstract:

Purpose: Pathologists routinely examine H&E-stained FFPE tissue slides under the microscope to arrive at diagnoses for patients. Pathomics, the study of quantitative imaging features from such samples, aims to improve diagnostic accuracy by revealing fine tissue and cellular details. Our research evaluates how well pathomic features from tissue imagery can predict gene expression, as determined by RNA sequencing, which could advance molecular profiling and targeted therapy for invasive breast carcinoma (IBC).

Method: We analyzed 90 regions of interest (ROIs) on FFPE tissue slide images from the TCGA registry for patients with IBC. Using HistomicsTK software, we extracted nearly 300 pathomic features describing cell morphometry, intensity, and gradient from these ROIs. We then assessed the intra-correlation of the pathomic features within the ROIs of the same tissue slide to identify the most heterogeneous and highly correlating features (n=20) across the three categories (morphometry, intensity, and gradient), requiring a Pearson correlation coefficient (r) greater than 0.9 and an FDR-adjusted p-value less than 0.05. Subsequently, gene expression data in the form of FPKM were obtained for the corresponding tissues from TCGA. Using ImaGene software, we trained a Multitask Elastic Net (MTEN) model on these pathomic features with an 80:20 split into training and testing sets and three-fold cross-validation applied to the training set. The model's performance was evaluated by measuring the AUC and R² value for the predicted gene expression on the testing set.

Result and Biological significance: On the testing set, the MTEN model predicted the expression of three genes identified in the literature as prognostic markers in breast cancer, namely ALDH1L2, MFAP5, and MXRA8, with an AUC greater than 0.8 and an R² above 0.5 at a p-value of less than 0.002.
According to the literature, expression of genes from the ALDH1 superfamily in breast cancer correlates with the stage of the disease, triple-negative status, and response to neoadjuvant therapy. The upregulation of MFAP5 in invasive breast carcinomas has been associated with high-risk prognostic features, such as higher tumor grade and stage and increased angiogenesis, and with poorer outcomes, such as lymph node metastasis. Furthermore, MXRA8 is involved in modulating the progression of human triple-negative breast cancer, likely through its influence on the interactions of tumor cells with their microenvironment.

Conclusion: The present study predicts the expression of clinically relevant genes in IBC using a heterogeneous set of pathomic features extracted from FFPE tissue slide imagery. By linking cell- and tissue-based morphometric, intensity, and gradient features with gene expression, researchers can gain insights into the molecular mechanisms underlying disease progression. Digital pathology and pathomics may reduce the need for additional genetic testing in critical patient cases by providing predictive information from routinely acquired pathology slides. By bridging the gap between phenotypic tissue data and molecular data, the predictive capabilities of pathomic features represent a significant advancement in the field of precision medicine.

Citation Format: Shrey S. Sukhadia, Digvijay Yadav, Kristen E. Muller. Transformative pathomics in oncology: Harnessing FFPE tissue slide imagery for clinically relevant gene expression prediction in invasive breast carcinoma [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 2 (Late-Breaking, Clinical Trial, and Invited Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(7_Suppl):Abstract nr LB392.
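The training setup described in the Method (an 80:20 split, three-fold cross-validation on the training portion, and a Multitask Elastic Net fitted jointly to several genes) can be illustrated with a minimal sketch. This is not the authors' ImaGene pipeline: it swaps in scikit-learn's `MultiTaskElasticNetCV`, and the data are synthetic stand-ins whose shapes (90 ROIs, 20 features, 3 genes) merely mirror the numbers in the abstract. The feature scaling step, the `l1_ratio` grid, and the random seed are all assumptions. Only R² is computed; AUC is omitted because the abstract does not say how expression values were binarized.

```python
import numpy as np
from sklearn.linear_model import MultiTaskElasticNetCV
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 90 ROIs x 20 pathomic features,
# and 3 target genes whose expression depends on the first 5 features.
n_rois, n_features, n_genes = 90, 20, 3
X = rng.normal(size=(n_rois, n_features))
true_coef = np.zeros((n_features, n_genes))
true_coef[:5] = rng.uniform(0.5, 1.5, size=(5, n_genes))
Y = X @ true_coef + 0.1 * rng.normal(size=(n_rois, n_genes))

# 80:20 split into training and testing sets.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0
)

# Multitask Elastic Net with three-fold cross-validation on the training
# set; the CV selects the regularization strength and the L1/L2 mix.
model = make_pipeline(
    StandardScaler(),
    MultiTaskElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=3, random_state=0),
)
model.fit(X_train, Y_train)

# Evaluate on the held-out testing set, one R^2 value per gene.
Y_pred = model.predict(X_test)
r2_per_gene = r2_score(Y_test, Y_pred, multioutput="raw_values")
print(r2_per_gene)
```

The multitask penalty selects the same subset of features for every gene, which is one plausible reason to prefer it when a small, shared panel of pathomic features is expected to explain several expression targets at once.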

hist2RNA: An efficient deep learning architecture to predict gene expression from breast cancer histopathology images

A deep learning model to predict RNA-Seq expression of tumours from whole slide images

Predicting Molecular Phenotypes from Histopathology Images: A Transcriptome-Wide Expression-Morphology Analysis in Breast Cancer

Spatial transcriptomics inferred from pathology whole-slide images links tumor heterogeneity to survival in breast and lung cancer

Deep learning-based classification of breast cancer molecular subtypes from H&E whole-slide images

Development, comparative study, and external validation of a new deep learning model for predicting genome-wide gene expression from histopathology slides.

Transcriptome-wide prediction of prostate cancer gene expression from histopathology images using co-expression based convolutional neural networks

Predicting Breast Cancer Gene Expression Signature by Applying Deep Convolutional Neural Networks From Unannotated Pathological Images

Estrogen Receptor Gene Expression Prediction from H&E Whole Slide Images

Breast Cancer Histopathology Image based Gene Expression Prediction using Spatial Transcriptomics data and Deep Learning

Deep learned tissue “fingerprints” classify breast cancers by ER/PR/Her2 status from H&E images

Cross-linking breast tumor transcriptomic states and tissue histology

Abstract 4270: Prediction of cancer transcriptomes from whole-slide images with Vis-Gene

HER2 and FISH Status Prediction in Breast Biopsy H&E-Stained Images Using Deep Learning

Deep Learning-based Prediction of Breast Cancer Tumor and Immune Phenotypes from Histopathology

Machine learning enabled prediction of digital biomarkers from whole slide histopathology images

Artificial Intelligence Predicts Multiclass Molecular Signatures and Subtypes Directly from Breast Cancer Histology

Abstract B082: Early risk stratification of ER+/HER2– breast cancer patients using digital pathology and multi-task, weakly-supervised deep learning

Deep learning identifies morphological patterns of homologous recombination deficiency in luminal breast cancers from whole slide images

Abstract LB392: Transformative pathomics in oncology: Harnessing FFPE tissue slide imagery for clinically relevant gene expression prediction in invasive breast carcinoma

Data-Efficient Computational Pathology Platform for Faster and Cheaper Breast Cancer Subtype Identifications: Development of a Deep Learning Model