Longitudinal interpretability of deep learning based breast cancer risk prediction
Zan Klanecek, Yao-Kuan Wang, Tobias Wagner, Lesley Cockmartin, Nicholas W Marshall, Brayden Schott, Alison Deatsch, Andrej Studen, Katja Jarm, Mateja Krajc, Miloš Vrhovec, Hilde Bosmans, Robert Jeraj
DOI: https://doi.org/10.1088/1361-6560/ad9db3
IF: 3.5
2024-12-13
Physics in Medicine and Biology
Abstract: Objective: Deep-learning-based models have achieved state-of-the-art breast cancer risk (BCR) prediction performance. However, these models are highly complex, and the underlying mechanisms of BCR prediction are not fully understood. A key question is whether these models can detect the breast morphologic changes that lead to cancer. Answering it would boost confidence in using BCR models in practice and give clinicians new perspectives. In this work, we aimed to determine when oncogenic processes in the breast provide sufficient signal for the models to detect these changes.
Approach: In total, 1210 screening mammograms were collected for patients screened at different times before the cancer was screen-detected, and 2400 mammograms for patients with at least ten years of follow-up. MIRAI, a BCR prediction model, was used to estimate the BCR. Attribution Heterogeneity was defined as the relative difference between the attributions obtained from the right and left breasts using one of eight interpretability techniques. Model reliance on the side of the breast with cancer was quantified with the area under the receiver operating characteristic curve (AUC). The Mann-Whitney U test was used to check for significant differences in median absolute Attribution Heterogeneity between cancer patients and healthy individuals.
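The side-reliance measure described above can be illustrated with a minimal sketch. The abstract only states that Attribution Heterogeneity is a "relative difference" between right and left attributions, so the normalization used here, (R - L) / (R + L) on per-breast attribution totals, is an assumption, as are the function names and the toy values. The AUC is computed as the fraction of (right-sided cancer, left-sided cancer) pairs in which the right-sided case has the larger heterogeneity, which is one plausible reading of "model reliance on the side of the breast with cancer" and not necessarily the authors' procedure.

```python
from typing import Sequence


def attribution_heterogeneity(right: float, left: float) -> float:
    """Relative right-vs-left difference of total attribution.

    Assumed form: (right - left) / (right + left). The source does not
    give the exact normalization.
    """
    total = right + left
    if total == 0:
        return 0.0
    return (right - left) / total


def auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair.
    """
    if len(scores_pos) == 0 or len(scores_neg) == 0:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical per-exam attribution totals: (right, left, side with cancer).
exams = [
    (0.80, 0.20, "R"),
    (0.55, 0.45, "R"),
    (0.30, 0.70, "L"),
    (0.60, 0.40, "L"),
]

ah = [(attribution_heterogeneity(r, l), side) for r, l, side in exams]
right_sided = [h for h, side in ah if side == "R"]
left_sided = [h for h, side in ah if side == "L"]

# 3 of the 4 (right-sided, left-sided) pairs are ranked correctly.
side_auc = auc(right_sided, left_sided)
print(f"side-reliance AUC = {side_auc:.2f}")
```

An AUC near 0.5 in this sketch would mean the attributions carry no information about which breast develops cancer, while values near 1 would mean the attributions concentrate on the affected side.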
Results: All tested attribution methods showed a similar longitudinal trend, where the model reliance on the side of the breast with cancer was the highest for the 0-1 Years-To-Cancer interval (AUC=0.85–0.95), dropped for the 1-3 Years-To-Cancer interval (AUC=0.64–0.71), and remained above the threshold for random performance for the 3-5 Years-To-Cancer interval (AUC=0.51–0.58). For all eight attribution methods, the median absolute Attribution Heterogeneity was significantly larger for patients who were diagnosed with cancer at some point than for healthy individuals (p<0.01).
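The group comparison reported above can be sketched as follows. This is a simplified, dependency-free illustration of a two-sided Mann-Whitney U test that uses the normal approximation without tie or continuity correction; it is not the authors' code, and the function name and all values are hypothetical. In practice a library routine such as `scipy.stats.mannwhitneyu` would normally be preferred, particularly for small samples, where exact p-values are more appropriate.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic for x, two-sided p-value).

    The p-value uses the normal approximation with no tie or continuity
    correction, so it is only a rough estimate for small samples.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    # U counts pairs (a, b) with a > b; ties contribute one half.
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_two_sided


# Hypothetical absolute Attribution Heterogeneity values per individual.
cancer = [0.42, 0.35, 0.51, 0.28, 0.60, 0.47, 0.39, 0.55]
healthy = [0.10, 0.05, 0.22, 0.15, 0.08, 0.18, 0.12, 0.03]

# Every cancer value exceeds every healthy value, so U = 8 * 8 = 64.
u_stat, p_value = mann_whitney_u(cancer, healthy)
print(f"U = {u_stat:.0f}, significant at 0.01: {p_value < 0.01}")
```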
Significance: Interpretability of BCR prediction has revealed that long-term predictions (beyond three years) are most likely based on typical breast characteristics, such as breast density; for mid-term predictions (one to three years), the model appears to detect early signs of tumor development, while for short-term predictions (up to a year), the BCR model essentially functions as a breast cancer detection model.
Engineering, Biomedical; Radiology, Nuclear Medicine & Medical Imaging