Estrogen Receptor Gene Expression Prediction from H&E Whole Slide Images

Anvita A. Srinivas, Ronnachai Jaroensri, Ellery Wulczyn, James H. Wren, Elaine E. Thompson, Niels Olson, Fabien Beckers, Melissa Miao, Yun Liu, Po-Hsuan Cameron Chen, David F. Steiner
DOI: https://doi.org/10.1101/2024.04.05.24302951
2024-04-09
Abstract: Gene expression profiling (GEP) provides valuable information for the care of breast cancer patients. However, the test itself is expensive and can take a long time to process. In contrast, microscopic examination of hematoxylin and eosin (H&E) stained tissue is inexpensive, fast, and integrated into the standard of care. This work explores the possibility of predicting gene expression from H&E images, and its use in predicting clinical variables and patient outcomes. We utilized a weakly supervised method to train a deep learning model to predict expression from whole slide images, and achieved 0.57 [95% CI: 0.46, 0.67] Pearson’s correlation with the ground truth value. Our expression prediction achieved an AUROC of 0.81 [0.74, 0.87] in predicting clinical ER status obtained using an immunohistochemistry staining technique, and a c-index of 0.59 [0.52, 0.65] in predicting progression-free interval for the patients in our cohort. This work further demonstrates the potential to infer gene expression from H&E stained images in a manner that shows meaningful associations with clinical variables. Because obtaining H&E stained images is substantially easier and faster than genetic testing, the capability to derive molecular genetic information from these images may increase access to this type of information for patient risk stratification and provide research insights into molecular-morphological associations.
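Illustrative note: the abstract reports three evaluation metrics (Pearson's correlation, AUROC, and c-index), each with a 95% confidence interval. The sketch below shows one common way such quantities can be computed. It is not the authors' code: the function names are hypothetical, the data are synthetic, and the percentile bootstrap is an assumption, since the abstract does not state how the intervals were obtained. The sign convention for the risk score (higher score means earlier progression) is likewise an assumption of this example.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between predicted and measured expression."""
    return float(np.corrcoef(x, y)[0, 1])

def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one
    (ties count as one half)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    diff = scores[labels][:, None] - scores[~labels][None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

def c_index(times, events, risk):
    """Harrell's concordance index. A pair (i, j) is comparable when subject i
    had an observed event before subject j's time; it is concordant when
    subject i also has the higher risk score."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    risk = np.asarray(risk, dtype=float)
    comparable = events[:, None] & (times[:, None] < times[None, :])
    risk_diff = risk[:, None] - risk[None, :]
    concordant = ((risk_diff > 0) & comparable).sum()
    tied = ((risk_diff == 0) & comparable).sum()
    return float((concordant + 0.5 * tied) / comparable.sum())

def bootstrap_ci(metric, *arrays, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI (assumed method): resample cases with replacement."""
    rng = np.random.default_rng(seed)
    arrays = [np.asarray(a) for a in arrays]
    n = len(arrays[0])
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        stats.append(metric(*(a[idx] for a in arrays)))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Synthetic stand-in data (not from the paper).
rng = np.random.default_rng(42)
true_expr = rng.normal(size=200)                        # measured expression
pred_expr = 0.6 * true_expr + rng.normal(size=200)      # model prediction
er_status = true_expr > 0                               # stand-in for IHC ER status
pfi_time = rng.exponential(scale=np.exp(0.3 * true_expr))  # time to progression
pfi_event = rng.random(200) < 0.7                       # True if progression observed

print("Pearson r:", pearson_r(pred_expr, true_expr),
      bootstrap_ci(pearson_r, pred_expr, true_expr))
print("AUROC:", auroc(er_status, pred_expr),
      bootstrap_ci(auroc, er_status, pred_expr))
# Risk is taken as the negated prediction here, an assumption of this sketch.
print("c-index:", c_index(pfi_time, pfi_event, -pred_expr),
      bootstrap_ci(c_index, pfi_time, pfi_event, -pred_expr))
```

The pairwise implementations are quadratic in the number of cases, which is acceptable for cohort-sized data; established libraries offer faster equivalents.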
Pathology