Pretrained Vision Models for Predicting High-Risk Breast Cancer Stage

Bonaventure F. P. Dossou, Yenoukoume S. K. Gbenou, Miglanche Ghomsi Nono
2023-03-20
Abstract: Cancer is an increasingly pressing global health issue. After cardiovascular diseases, cancers are the second leading cause of death worldwide, with millions of people succumbing to the disease every year. According to a World Health Organization (WHO) report, by the end of 2020 more than 7.8 million women had been diagnosed with breast cancer, making it the world's most prevalent cancer. In this paper, using the Nightingale Open Science dataset of digital pathology (breast biopsy) images, we leverage pre-trained computer vision models for the breast cancer stage prediction task. While individual models achieve decent performance, we find that the predictions of an ensemble model are more efficient and offer a winning solution (results: https://www.nightingalescience.org/updates/hbc1-results). We also provide analyses of the results and explore pathways for better interpretability and generalization. Our code is open-source at https://github.com/bonaventuredossou/nightingale_winning_solution
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The main objective of this paper is to utilize pre-trained computer vision models to predict the staging of high-risk breast cancer. Specifically, the researchers used digital pathology images (breast biopsies) from the Nightingale Open Science dataset for this task. Although individual models showed some effectiveness in predicting breast cancer staging, the study found that ensemble models (Deep Ensemble) performed better, providing more efficient and reliable solutions. The research team also explored how causal inference methods can improve the model's interpretability, performance, and generalization ability, and they have open-sourced their code. This work not only demonstrates the potential application of pre-trained computer vision models in breast cancer staging prediction but also points out future research directions, including uncertainty estimation and the application of causal inference.
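To make the ensemble idea concrete, the following is a minimal sketch of one common way to combine several classifiers: average their per-class probabilities and pick the most likely class. It is not taken from the authors' repository. The function names, the number of stage classes, and the example logits are all hypothetical, and the paper may combine its models differently.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(per_model_logits):
    """Average class probabilities across models, then take the argmax.

    per_model_logits: one list of class scores per model, all of equal length.
    Returns (predicted_class_index, averaged_probabilities).
    """
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    mean_probs = [
        sum(p[c] for p in probs) / n_models for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: mean_probs[c])
    return predicted, mean_probs

# Hypothetical logits from three models for a single image.
# Five stage classes are assumed here purely for illustration.
example_logits = [
    [2.0, 0.5, 0.1, -1.0, -2.0],
    [0.2, 1.5, 0.3, -0.5, -1.0],
    [1.8, 0.4, 0.0, -1.2, -1.5],
]
stage, mean_probs = ensemble_predict(example_logits)
print(stage, [round(p, 3) for p in mean_probs])
```

In this toy input the second model disagrees with the other two, and averaging lets the majority view prevail while still lowering the confidence of the final prediction. The spread between the individual models' probabilities is also one simple signal that could feed the uncertainty estimation mentioned above as a future direction.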