Abstract: Deep neural networks (DNNs) that predict mutational status from H&E slides of cancers can enable inexpensive and timely precision oncology. Although expert knowledge is reliable for annotating regions informative of malignancy and other known histological patterns (strong supervision), it is unreliable for identifying regions informative of mutational status. This poses a serious impediment to obtaining higher prognostic accuracy and discovering new knowledge of pathobiology. We used a weakly supervised learning technique to train a DNN to predict BRAF V600E mutational status, determined using DNA testing, in H&E-stained images of thyroid cancer tissue without regional annotations. Our discovery cohort was a tissue microarray of only 85 patients from a single hospital. Yet, on a large independent external cohort of 444 patients from other hospitals, the trained model achieved an AUC of 0.98 (95% CI: 0.97–1.00), which is much higher than previously reported results for detecting any mutation from H&E images by DNNs trained using strong supervision. We also developed a visualization technique that automatically highlights the regions the DNN found most informative for predicting mutational status. Our visualization is spatially granular and highly specific in highlighting strongly negative and strongly positive regions, and it moves us towards explainable artificial intelligence. Using t-tests, we confirmed that the proportions of follicular or papillary histology and oncocytic cytology, as noted for each patient by a pathologist who was blinded to the mutational status, were significantly different between mutated and wildtype patients. However, based solely on these features noted by the pathologist, a logistic regression classifier gave an average AUC of 0.78 in 5-fold cross-validation, which is much lower than that obtained using the DNN.
These results highlight the potential of weakly supervised learning for training DNN models for problems where the informative visual patterns and their locations are not known a priori.
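For readers unfamiliar with the headline metric, the following is a minimal sketch of how a patient-level AUC and a 95% confidence interval can be computed from binary mutation labels and model scores. The abstract does not state how the reported interval was obtained, so the percentile bootstrap used here is an assumption, and the function names `roc_auc` and `bootstrap_ci` and the toy data are hypothetical illustrations rather than the study's code or data.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    roc_auc(labels, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        # Resample patients with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain a single class (AUC undefined).
        if 0 < sum(ys) < n:
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Toy labels (1 = mutated, 0 = wildtype) and scores, for illustration only.
    y = [0, 0, 1, 1, 0, 1, 1, 0]
    s = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.70, 0.30]
    print("AUC:", roc_auc(y, s))
    print("95% CI:", bootstrap_ci(y, s))
```

The pairwise definition is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients; a rank-based formulation would be preferable for much larger ones.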

Weakly supervised learning on unannotated H&E‐stained slides predicts BRAF mutation in thyroid cancer with high accuracy