A CNN-based methodology for breast cancer diagnosis using thermal images

Juan Zuluaga-Gomez, Zeina Al Masry, Khaled Benaggoune, Safa Meraghni, Noureddine Zerhouni
DOI: https://doi.org/10.48550/arXiv.1910.13757
2019-10-30
Abstract: Micro Abstract: A recent GLOBOCAN study reported that in 2018 two million women worldwide were diagnosed with breast cancer. This study presents a computer-aided diagnosis system based on convolutional neural networks as an alternative methodology for breast cancer diagnosis with thermal images. Experimental results showed that lower false-positive and false-negative classification rates are obtained when data pre-processing and data augmentation techniques are applied to these thermal images. Background: There are many breast cancer screening techniques, such as mammography, magnetic resonance imaging, ultrasound and blood sample tests, which require either expensive devices or qualified personnel. Currently, some countries still lack access to these main screening techniques due to economic, social or cultural issues. The objective of this study is to demonstrate that computer-aided diagnosis (CAD) systems based on convolutional neural networks (CNN) are faster, more reliable and more robust than other techniques. Methods: We studied the influence of data pre-processing, data augmentation and database size on a proposed set of CNN models. Furthermore, we developed a CNN hyper-parameter fine-tuning optimization algorithm using a Tree Parzen Estimator. Results: On the 57-patient database, our CNN models obtained a higher accuracy (92%) and F1-score (92%), outperforming several state-of-the-art architectures such as ResNet50, SeResNet50 and Inception. Also, we demonstrated that a CNN model that implements data augmentation techniques reaches identical performance metrics in comparison with a CNN that uses a database up to 50% bigger. Conclusion: This study highlights the benefits of data augmentation and CNNs in thermal breast images. It also measures the influence of the database size on the performance of CNNs.
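The abstract reports accuracy and F1-score together with false-positive and false-negative rates. As a reminder of how these quantities relate, here is a minimal sketch that derives them from confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive common metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. The counts used below are illustrative only.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard against empty denominators so the function never divides by zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for 100 test images (not results from the paper).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is the harmonic mean of precision and recall, it drops whenever either false positives or false negatives rise, which is why it is often reported alongside accuracy in diagnostic settings.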
Computer Vision and Pattern Recognition, Image and Video Processing
What problem does this paper attempt to address?
The main problem this paper addresses is the early diagnosis of breast cancer through thermography. Specifically, the researchers developed a computer-aided diagnosis (CAD) system based on a convolutional neural network (CNN) that diagnoses breast cancer from thermographic images. The paper focuses on the following aspects:

1. **Improving diagnostic efficiency and reliability**: By using thermography, the researchers aim to provide a faster, more reliable and more robust method for breast cancer diagnosis, especially in countries and regions with limited resources, where traditional screening techniques (such as mammography and magnetic resonance imaging) may not be widely available.
2. **Application of data pre-processing and data augmentation techniques**: The researchers explored how data pre-processing and data augmentation improve the performance of the CNN model and reduce the misdiagnosis rate (false-positive and false-negative rates).
3. **Miniaturization and simplification of CNN architecture**: The paper points out that smaller and simpler CNN architectures can, in some cases, outperform existing complex architectures (such as ResNet50, SeResNet50 and Inception), which provides a more lightweight solution for practical applications.
4. **Trade-off between data augmentation and database size**: The researchers measured the impact of data augmentation and database size on CNN performance, and found that a model trained with data augmentation on a smaller database can match the performance of a model trained on a larger one.
5. **Hyper-parameter optimization**: To further improve model performance, the researchers developed a hyper-parameter optimization algorithm based on the Tree Parzen Estimator (TPE), a Bayesian optimization method, to search for the best combination of hyper-parameters.

In conclusion, this paper aims to provide an efficient, reliable and easy-to-implement breast cancer diagnosis scheme based on thermography and deep learning, especially for areas where traditional screening techniques are hard to deploy widely.
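To make the TPE idea in the summary more concrete, the following is a simplified, one-dimensional sketch in plain Python. It is not the authors' algorithm or search space: the names `tpe_minimize`, `parzen_density` and `toy_validation_loss`, the fixed kernel bandwidth, and the toy objective (a stand-in for training a CNN and returning its validation loss) are all assumptions made for illustration.

```python
import math
import random


def parzen_density(x, centers, bandwidth):
    """Parzen estimate: average of Gaussian kernels centred on past points."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers)
    return total / (len(centers) * norm)


def tpe_minimize(objective, low, high, n_trials=80, n_startup=10,
                 gamma=0.25, n_candidates=24, seed=0):
    """Simplified 1-D TPE loop (illustrative, fixed bandwidth)."""
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # assumption: constant kernel width
    trials = []  # list of (loss, x) pairs

    for i in range(n_trials):
        if i < n_startup:
            # Warm-up phase: plain random search.
            x = rng.uniform(low, high)
        else:
            # Split past trials into the best gamma fraction and the rest.
            ranked = sorted(trials)
            n_good = max(1, int(gamma * len(ranked)))
            good = [v for _, v in ranked[:n_good]]
            bad = [v for _, v in ranked[n_good:]]

            # Draw candidates from the "good" density l(x): pick a good
            # point, perturb it with Gaussian noise, clip to the bounds.
            candidates = []
            for _ in range(n_candidates):
                centre = rng.choice(good)
                cand = rng.gauss(centre, bandwidth)
                candidates.append(min(high, max(low, cand)))

            # Keep the candidate that maximises l(x) / g(x).
            x = max(
                candidates,
                key=lambda v: parzen_density(v, good, bandwidth)
                / (parzen_density(v, bad, bandwidth) + 1e-12),
            )
        trials.append((objective(x), x))

    best_loss, best_x = min(trials)
    return best_x, best_loss


def toy_validation_loss(log10_lr):
    """Hypothetical stand-in for 'train a CNN, return validation loss'."""
    return (log10_lr + 3.0) ** 2  # minimum at a learning rate of 1e-3


best_x, best_loss = tpe_minimize(toy_validation_loss, low=-6.0, high=0.0)
print(f"best log10(lr) = {best_x:.3f}, loss = {best_loss:.4f}")
```

In practice one would more likely rely on an existing library implementation of TPE than on a hand-written loop; the sketch only shows the core step of modelling good and bad trials separately and sampling where their density ratio is highest.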