Clinical-Grade Detection of Microsatellite Instability in Colorectal Tumors by Deep Learning
Amelie Echle, Heike Irmgard Grabsch, Philip Quirke, Piet A van den Brandt, Nicholas P West, Gordon G A Hutchins, Lara R Heij, Xiuxiang Tan, Susan D Richman, Jeremias Krause, Elizabeth Alwers, Josien Jenniskens, Kelly Offermans, Richard Gray, Hermann Brenner, Jenny Chang-Claude, Christian Trautwein, Alexander T Pearson, Peter Boor, Tom Luedde, Nadine Therese Gaisa, Michael Hoffmeister, Jakob Nikolas Kather
DOI: https://doi.org/10.1053/j.gastro.2020.06.021
IF: 29.4
Gastroenterology
Abstract:
Background & Aims: Microsatellite instability (MSI) and mismatch-repair deficiency (dMMR) in colorectal tumors are used to select treatment for patients. Deep learning can detect MSI and dMMR in tumor samples on routine histology slides faster and less expensively than molecular assays. However, clinical application of this technology requires high performance and multisite validation, which have not yet been performed.
Methods: We collected H&E-stained slides and findings from molecular analyses for MSI and dMMR from 8836 colorectal tumors (of all stages) included in the MSIDETECT consortium study, from Germany, the Netherlands, the United Kingdom, and the United States. Specimens with dMMR were identified by immunohistochemistry analyses of tissue microarrays for loss of MLH1, MSH2, MSH6, and/or PMS2. Specimens with MSI were identified by genetic analyses. We trained a deep-learning detector to identify samples with MSI from these slides; performance was assessed by cross-validation (n = 6406 specimens) and validated in an external cohort (n = 771 specimens). Prespecified endpoints were the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC).
Results: In the cross-validation development cohort, the deep-learning detector identified specimens with dMMR or MSI with a mean AUROC of 0.92 (lower bound, 0.91; upper bound, 0.93) and an AUPRC of 0.63 (range, 0.59-0.65), corresponding to 67% specificity at 95% sensitivity. In the validation cohort, the classifier identified samples with dMMR with an AUROC of 0.95 (range, 0.92-0.96) without image preprocessing and an AUROC of 0.96 (range, 0.93-0.98) after color normalization.
Conclusions: We developed a deep-learning system that detects colorectal cancer specimens with dMMR or MSI using H&E-stained slides; it detected tissues with dMMR with an AUROC of 0.96 in a large, international validation cohort. This system might be used for high-throughput, low-cost evaluation of colorectal tissue specimens.
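The endpoints named in the abstract (AUROC, AUPRC, and specificity at a fixed sensitivity) can be illustrated with a minimal sketch. The abstract does not say how the authors computed them, so the function names, the rank-based AUROC, the step-wise average-precision estimate of AUPRC, and the toy labels and scores below are all assumptions made for illustration. None of the numbers are study data.

```python
import math
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise AUPRC estimate: mean of the precision at the rank of each positive.

    Tied scores are kept in input order (no special tie handling).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive sample")
    return total / hits


def specificity_at_sensitivity(
    labels: Sequence[int], scores: Sequence[float], target: float = 0.95
) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) for the highest threshold
    whose sensitivity is at least `target`; a sample is called positive
    when its score is >= threshold."""
    pos = sorted((s for y, s in zip(labels, scores) if y == 1), reverse=True)
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    needed = max(1, math.ceil(target * len(pos)))  # positives that must be detected
    threshold = pos[needed - 1]
    sensitivity = sum(s >= threshold for s in pos) / len(pos)
    specificity = sum(s < threshold for s in neg) / len(neg)
    return threshold, sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical toy data: 1 = MSI/dMMR, 0 = microsatellite stable.
    toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
    toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.7]
    print("AUROC:", auroc(toy_labels, toy_scores))
    print("AUPRC (average precision):", average_precision(toy_labels, toy_scores))
    print("Operating point:", specificity_at_sensitivity(toy_labels, toy_scores, 0.95))
```

Fixing sensitivity first and then reading off specificity reflects a rule-out use of the classifier, where missing an MSI or dMMR case costs more than sending an extra specimen for molecular testing. AUPRC is reported alongside AUROC because it is more sensitive to class imbalance.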