Breast Cancer Molecular Subtype Prediction: A Mammography-Based AI Approach

Ana M. Mota, João Mendes, Nuno Matela
DOI: https://doi.org/10.3390/biomedicines12061371
IF: 4.757
2024-06-21
Biomedicines
Abstract: Breast cancer remains a leading cause of mortality among women, with molecular subtypes significantly influencing prognosis and treatment strategies. Currently, identifying the molecular subtype of cancer requires a biopsy, a specialized, expensive, and time-consuming procedure whose results often must be supported with additional biopsies because of technique errors or tumor heterogeneity. This study introduces a novel approach for predicting breast cancer molecular subtypes using mammography images and advanced artificial intelligence (AI) methodologies. Using the OPTIMAM imaging database, 1397 images from 660 patients were selected. The pretrained deep learning model ResNet-101 was employed to classify tumors into five subtypes: Luminal A, Luminal B1, Luminal B2, HER2, and Triple Negative. Various classification strategies were studied: binary classifications (one vs. all others, specific combinations) and multi-class classification (evaluating all subtypes simultaneously). To address imbalanced data, strategies like oversampling, undersampling, and data augmentation were explored. Performance was evaluated using accuracy and area under the receiver operating characteristic curve (AUC). Binary classification results showed a maximum average accuracy and AUC of 79.02% and 64.69%, respectively, while multi-class classification achieved an average AUC of 60.62% with oversampling and data augmentation. The most notable binary classification was HER2 vs. non-HER2, with an accuracy of 89.79% and an AUC of 73.31%. Binary classification for specific combinations of subtypes revealed an accuracy of 76.42% for HER2 vs. Luminal A and an AUC of 73.04% for HER2 vs. Luminal B1. These findings highlight the potential of mammography-based AI for non-invasive breast cancer subtype prediction, offering a promising alternative to biopsies and paving the way for personalized treatment plans.
biochemistry & molecular biology; medicine, research & experimental; pharmacology & pharmacy
What problem does this paper attempt to address?
This paper addresses the problem of predicting the molecular subtype of breast cancer directly from mammography images with artificial intelligence, as a non-invasive way to reduce the need for biopsy. At present, determining the molecular subtype requires a biopsy, which is highly specialized, costly, and time-consuming. Because of technical errors or tumor heterogeneity, additional biopsies are often needed to support the result. The authors propose predicting the subtype from mammograms instead, which could make personalized treatment planning possible without an invasive procedure. Specifically, the research team selected 1397 images from 660 patients in the OPTIMAM imaging database and used the pretrained deep learning model ResNet-101 to classify tumors into five subtypes: Luminal A, Luminal B1, Luminal B2, HER2, and Triple Negative. To address class imbalance, they explored oversampling, undersampling, and data augmentation, and they evaluated performance with accuracy and the area under the receiver operating characteristic curve (AUC). Among the binary classification tasks, HER2 vs. non-HER2 performed best, with an accuracy of 89.79% and an AUC of 73.31%. In the multi-class task, oversampling combined with data augmentation gave an average AUC of 60.62%. These findings indicate that mammography-based AI has potential for non-invasive prediction of breast cancer molecular subtypes and could serve as a supplement or alternative to biopsy, supporting personalized treatment planning.
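The evaluation setup described above (one-vs-all labels, oversampling of minority classes, accuracy and AUC) can be illustrated with a minimal pure-Python sketch. It does not reproduce the authors' pipeline: no images or ResNet-101 are involved, the function names are hypothetical, the toy labels are invented for illustration, and the oversampling shown is plain random duplication, which may differ from what the paper used.

```python
import random
from collections import Counter

# The five subtypes named in the abstract.
SUBTYPES = ["Luminal A", "Luminal B1", "Luminal B2", "HER2", "Triple Negative"]


def one_vs_rest(labels, positive):
    """Binarize subtype labels: 1 for the chosen subtype, 0 for all others."""
    return [1 if lab == positive else 0 for lab in labels]


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class (assumed strategy)."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(items) for items in by_class.values())
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_samples.extend(items + extra)
        out_labels.extend([label] * target)
    return out_samples, out_labels


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive is scored above a
    random negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, invented labels: an imbalanced set in which HER2 is a minority class.
toy_labels = ["Luminal A"] * 6 + ["HER2"] * 2 + ["Triple Negative"] * 2
toy_samples = list(range(len(toy_labels)))  # stand-ins for image indices

balanced_samples, balanced_labels = random_oversample(toy_samples, toy_labels)
print(Counter(toy_labels))       # class counts before oversampling
print(Counter(balanced_labels))  # class counts after oversampling
print(one_vs_rest(toy_labels, "HER2"))
```

Computing both metrics matters on imbalanced data: accuracy is sensitive to how common the majority class is, while AUC depends only on how well the scores rank positives above negatives. Oversampling would normally be applied to the training split only, so that duplicated samples do not leak into the evaluation data.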