Classification of COVID-19 on chest X-Ray images using Deep Learning model with Histogram Equalization and Lungs Segmentation

Aman Swaraj, Karan Verma
DOI: https://doi.org/10.48550/arXiv.2112.02478
2022-07-11
Abstract: Background and Objective: Artificial intelligence (AI) methods coupled with biomedical analysis play a critical role during pandemics, as they help relieve the overwhelming pressure on healthcare systems and physicians. As the ongoing COVID-19 crisis worsens in densely populated countries with inadequate testing kits, such as Brazil and India, radiological imaging can act as an important diagnostic tool to accurately classify COVID-19 patients and prescribe the necessary treatment in due time. With this motivation, we present a study based on a deep learning architecture for detecting COVID-19-infected lungs using chest X-rays. Dataset: We collected a total of 2470 images for three class labels, namely healthy lungs, ordinary pneumonia, and COVID-19-infected pneumonia, of which 470 X-ray images belong to the COVID-19 category. Methods: We first pre-process all images using histogram equalization techniques and segment them using a U-Net architecture. A VGG-16 network is then used to extract features from the pre-processed images, and the extracted features are resampled with the SMOTE oversampling technique to obtain a balanced dataset. Finally, the class-balanced features are classified using a support vector machine (SVM) classifier with 10-fold cross-validation, and the accuracy is evaluated. Result and Conclusion: Our novel approach, combining well-known pre-processing techniques, feature extraction methods, and a dataset balancing method, leads to an outstanding recognition rate of 98% for COVID-19 images over a dataset of 2470 X-ray images. Our model is therefore fit to be utilized in healthcare facilities for screening purposes.
Image and Video Processing, Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The main problem this paper addresses is improving the classification accuracy of COVID-19 on chest X-ray images. Specifically, the authors point out several deficiencies in existing research:

1. **Lack of lung segmentation**: Most existing works do not perform lung segmentation, so the model may learn features from non-infected areas, which reduces classification accuracy.
2. **Imbalanced datasets**: Many studies use imbalanced datasets and adopt unrealistic augmentation techniques, which may lead to overfitting.
3. **Differences in data sources**: X-ray images of COVID-19 patients come from multiple data sources, while images of normal and pneumonia patients usually come from a single source. This mismatch may hurt the model's ability to generalize.

To overcome these problems, the authors propose a method with the following steps:

1. **Data pre-processing**:
   - Use histogram equalization to enhance image quality.
   - Use U-Net for lung segmentation so that the model learns features only from the lung area.
2. **Feature extraction**:
   - Use the VGG-16 network to extract features from the pre-processed images.
   - Apply the SMOTE technique to oversample the feature vectors and balance the dataset.
3. **Classification and evaluation**:
   - Use a support vector machine (SVM) classifier for multi-class classification.
   - Evaluate the model's performance with 10-fold cross-validation.

Through these steps, the authors aim to provide a more reliable and accurate COVID-19 detection method. The method achieved a recognition rate of 98% for COVID-19 images on a dataset of 2,470 chest X-ray images, a result reported to be significantly better than existing methods.
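Two of the steps described above, histogram equalization and SMOTE-style oversampling, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `equalize_histogram` and `smote_like_oversample`, the parameters `levels`, `k`, and `seed`, and the toy inputs are all assumptions made for illustration. The U-Net segmentation, VGG-16 feature extraction, and SVM classification steps are not shown, and the paper may use a different equalization variant than the global one sketched here.

```python
import random


def equalize_histogram(image, levels=256):
    """Global histogram equalization of a grayscale image.

    `image` is a list of rows of integer intensities in [0, levels).
    Hypothetical helper for illustration only.
    """
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(flat)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [row[:] for row in image]

    # Map each intensity so the output CDF is approximately linear.
    lut = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lut[p] for p in row] for row in image]


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create `n_new` synthetic minority-class feature vectors.

    Each synthetic vector is a random interpolation between a minority
    sample and one of its k nearest minority neighbours, which is the
    core idea of SMOTE. Assumes at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (v for v in minority if v is not base),
            key=lambda v: sum((a - b) ** 2 for a, b in zip(base, v)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy usage: a low-contrast 2x2 "image" and four 2-D "feature vectors".
equalized = equalize_histogram([[50, 50], [100, 150]])
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
extra = smote_like_oversample(features, n_new=6)
```

In this sketch the oversampling operates on feature vectors rather than on raw images, matching the order described above, where SMOTE is applied after feature extraction. In practice an established image-processing library and an imbalanced-learning library would likely replace both helpers.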