Large-Scale Deep Learning for Metastasis Detection in Pathology Reports

Zachary R Fox, Patrycja Krawczuk, Valentina Petkov, Serban Negoita, Jennifer Doherty, Antoinette Stroupe, Stephen Schwartz, Lynne Penberthy, Elizabeth Hsu, John Gounley, Heidi Hanson
DOI: https://doi.org/10.1101/2024.12.12.24318789
2024-12-13
Abstract: No existing algorithm can reliably identify metastasis from pathology reports across multiple cancer types and the entire US population. In this study, we develop a deep learning model that automatically detects patients with metastatic cancer using pathology reports that come from many laboratories and cover multiple cancer types. We trained and validated our model on a cohort of 29,632 patients from four Surveillance, Epidemiology, and End Results (SEER) registries linked to 60,471 unstructured pathology reports. Our deep learning architecture trained on task-specific data outperforms a general-purpose LLM, with a recall of 0.894 compared to 0.824. We quantified model uncertainty and used it to defer reports for human review. We found that retaining 72.9% of reports and deferring the remainder increased recall from 0.894 to 0.969. This approach could streamline population-based cancer surveillance to help address the unmet need to capture recurrence or progression.
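The deferral step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how uncertainty is quantified, so this example assumes, purely for illustration, that uncertainty is the closeness of a predicted probability to 0.5. The function names (`recall`, `defer_by_uncertainty`) and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def recall(y_true, y_pred):
    """Recall = TP / (TP + FN) for the positive (metastatic) class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    positives = y_true == 1
    if positives.sum() == 0:
        return float("nan")
    return float((y_pred[positives] == 1).mean())

def defer_by_uncertainty(probs, retain_fraction):
    """Return a boolean mask of reports kept for automatic classification.

    Assumption (not from the paper): uncertainty is measured as how close
    the predicted probability is to 0.5. The most uncertain reports are
    deferred to human review; the rest are retained.
    """
    probs = np.asarray(probs, dtype=float)
    uncertainty = 1.0 - 2.0 * np.abs(probs - 0.5)  # 0 = certain, 1 = maximally unsure
    n_keep = int(round(retain_fraction * len(probs)))
    keep_idx = np.argsort(uncertainty, kind="stable")[:n_keep]
    mask = np.zeros(len(probs), dtype=bool)
    mask[keep_idx] = True
    return mask

# Toy data (hypothetical): 1 = metastatic, 0 = not metastatic.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0])
probs = np.array([0.95, 0.90, 0.85, 0.45, 0.05, 0.10, 0.55, 0.20])
y_pred = (probs >= 0.5).astype(int)

recall_all = recall(y_true, y_pred)                 # recall on every report
keep = defer_by_uncertainty(probs, retain_fraction=0.75)
recall_kept = recall(y_true[keep], y_pred[keep])    # recall on retained reports only

print(f"recall on all reports:      {recall_all:.3f}")
print(f"recall on retained reports: {recall_kept:.3f}")
```

In this toy case the two deferred reports are exactly the ones the classifier gets wrong, so recall on the retained subset rises. Real data will not separate this cleanly, and the retained fraction is a trade-off between automation and accuracy.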