Two-way threshold-based intelligent water drops feature selection algorithm for accurate detection of breast cancer

Dhruba Jyoti Kalita, Vibhav Prakash Singh, Vinay Kumar
DOI: https://doi.org/10.1007/s00500-021-06498-3
IF: 3.732
2021-11-23
Soft Computing
Abstract: Breast cancer is one of the common causes of death among women worldwide. A Computer-Aided Diagnosis (CAD) system built on X-ray mammograms can detect breast cancer at an early stage, which can reduce the death rate to a large extent. This paper proposes a novel two-way threshold-based intelligent water drops (IWD) algorithm for feature selection, with the aim of designing an effective and efficient CAD system that detects breast cancer at an early stage. The approach first extracts local binary patterns in the wavelet domain from mammograms and then applies the proposed two-way threshold-based IWD algorithm to select the most important subset of features from the extracted feature set. Two-way thresholding is a technique for finding a lower bound and an upper bound on the number of features to be selected in the optimal subset. Using these threshold values, IWD can produce multiple optimal subsets of features rather than a single one. The best of these subsets is then used to train and deploy a support vector machine (SVM) that classifies new mammograms. The results show that the proposed model outperforms many existing CAD systems. The proposed feature selection technique is also compared with other meta-heuristic feature selection techniques, namely ant colony optimization, particle swarm optimization, simulated annealing, genetic algorithm, gravitational search algorithm, inclined planes optimization and gray wolf optimization, and is found to outperform them. The accuracy, precision, recall, specificity and F1-score of the proposed framework are 99%, 98.7%, 98.123%, 96.2% and 98.4%, respectively, on the MIAS dataset, and 97.89%, 96.9%, 96.4%, 94.8% and 96.2%, respectively, on the DDSM dataset.
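The two-way thresholding idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' IWD algorithm: the per-size search is replaced by a simple top-k ranking, and the names `score_subset`, `best_subset_of_size`, `two_way_threshold_select`, the `relevance` scores and the `size_penalty` parameter are all hypothetical, introduced here only to show how a lower and an upper bound on subset size yield several candidate subsets from which the best one is kept.

```python
def score_subset(subset, relevance, size_penalty=0.1):
    """Hypothetical fitness: summed relevance minus a quadratic size penalty.

    In the paper the quality of a subset would come from the classifier;
    this closed-form score is an assumption made to keep the sketch short.
    """
    return sum(relevance[i] for i in subset) - size_penalty * len(subset) ** 2


def best_subset_of_size(k, relevance):
    """Stand-in for one size-constrained search (NOT the IWD algorithm).

    Returns the indices of the k features with the highest relevance.
    """
    ranked = sorted(range(len(relevance)), key=lambda i: relevance[i], reverse=True)
    return tuple(sorted(ranked[:k]))


def two_way_threshold_select(relevance, lower, upper):
    """Build one candidate subset per size in [lower, upper], keep the best."""
    if not 1 <= lower <= upper <= len(relevance):
        raise ValueError("need 1 <= lower <= upper <= number of features")
    candidates = [best_subset_of_size(k, relevance) for k in range(lower, upper + 1)]
    best = max(candidates, key=lambda s: score_subset(s, relevance))
    return candidates, best


if __name__ == "__main__":
    # Made-up relevance scores for six features, purely for illustration.
    relevance = [0.9, 0.1, 0.8, 0.3, 0.7, 0.05]
    candidates, best = two_way_threshold_select(relevance, lower=2, upper=4)
    print("candidate subsets:", candidates)
    print("selected subset:", best)
```

In a pipeline like the one the abstract describes, the selected feature indices would then be used to train the SVM classifier; that step is omitted here.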
computer science, artificial intelligence, interdisciplinary applications