Bimodal Distribution Removal and Genetic Algorithm in Neural Network for Breast Cancer Diagnosis

Ke Quan

DOI: https://doi.org/10.48550/arXiv.2002.08729

2020-02-20

Abstract: Diagnosis of breast cancer has been well studied in the past. Multiple linear programming models have been devised to approximate the relationship between cell features and tumour malignancy. However, these models are less capable of handling non-linear correlations. Neural networks, by contrast, are powerful in processing complex non-linear correlations. It is thus beneficial to approach this cancer diagnosis problem with a neural network-based model. In particular, introducing bias to the neural network training process is deemed an important means of increasing training efficiency. Among the popular methods proposed for introducing artificial bias, Bimodal Distribution Removal (BDR) offers ideal efficiency improvements and is fairly simple to implement. However, this paper examines the effectiveness of BDR on the target cancer diagnosis classification problem and shows that the BDR process in fact negatively impacts classification performance. In addition, this paper explores the genetic algorithm as an efficient tool for feature selection and shows that it produces significantly better results than a baseline model without any feature selection.

Machine Learning, Computer Vision and Pattern Recognition, Neural and Evolutionary Computing, Image and Video Processing

What problem does this paper attempt to address?

The main problems that this paper attempts to solve include:

1. **Handling Non-linear Relationships in Breast Cancer Diagnosis**:
   - Traditional linear programming models perform poorly when dealing with the complex non-linear relationships between cell features and tumor malignancy. Neural Networks (NN), due to their strong non-linear processing capabilities, are considered more suitable for the classification diagnosis of breast cancer.
   - Formula representation: Suppose \( f(x) \) is the relationship function between cell feature \( x \) and tumor malignancy. Then the linear model assumes \( f(x) = w^{T}x + b \), while the neural network can capture more complex non-linear relationships \( f(x;\theta) \), where \( \theta \) represents network parameters.
2. **Feature Selection and Noise Data Processing**:
   - Two techniques are introduced in the study to improve the performance of neural networks: Bimodal Distribution Removal (BDR) and Genetic Algorithm (GA). BDR aims to remove abnormal patterns during the training process, and GA is used to select the most representative features to improve the generalization ability and prediction accuracy of the model.
3. **Evaluating the Effectiveness of BDR and GA**:
   - The paper verifies the influence of BDR and GA on the performance of neural networks through experiments. Specifically, the researchers hope to confirm:
     - Whether BDR can effectively remove noise data, thereby improving the generalization ability of the model.
     - Whether GA can effectively screen out features that contribute to tumor malignancy, thereby improving the accuracy and training efficiency of the model.

### Summary of Main Research Contents

- **Background and Motivation**:
  - Traditional linear programming models have limitations in dealing with breast cancer diagnosis problems, especially in handling complex non-linear relationships. Therefore, using neural networks for classification is a better choice.
- **Methods**:
  - Use the Wisconsin Breast Cancer Diagnosis dataset for experiments.
  - Design a three-layer neural network and conduct comparative experiments with three models: the control group, experimental group B (applying BDR), and experimental group G (applying GA).
  - Compare the classification performance of different models and evaluate the effects of BDR and GA.
- **Results and Discussion**:
  - **Effectiveness of GA**:
    - The experimental results show that GA can significantly improve classification accuracy, especially when the number of remaining features is between 16 and 20. In addition, GA also reduces the number of hidden-layer neurons required, from about 40 to about 28.
  - **Effectiveness of BDR**:
    - The experimental results indicate that BDR does not significantly improve classification accuracy, and in some cases, it even reduces performance. This may be because the removed patterns are not real noise data but meaningful input information.

### Conclusion

This paper verifies the effectiveness of GA in feature selection through experiments, but BDR does not achieve the expected effect in removing noise data and may even mistakenly remove some useful information. Therefore, more caution is required in the application of BDR, especially when determining whether the bimodal distribution is truly caused by noise.
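To make the BDR idea concrete, here is a minimal sketch of a single removal pass over per-pattern training errors. It follows the commonly described form of BDR (wait until the error variance is low, then drop patterns whose error lies well above the mean of the high-error subset). The function name `bdr_filter`, the parameter `alpha`, and the `variance_trigger` value of 0.1 are illustrative assumptions, not details taken from this paper.

```python
from statistics import mean, pstdev, pvariance


def bdr_filter(errors, alpha=0.5, variance_trigger=0.1):
    """Return indices of training patterns kept after one BDR-style pass.

    errors: per-pattern error of the network at the current epoch.
    alpha, variance_trigger: assumed settings, chosen for illustration.
    """
    keep_all = list(range(len(errors)))
    # While errors are still widely spread, remove nothing.
    if pvariance(errors) >= variance_trigger:
        return keep_all
    overall_mean = mean(errors)
    # Candidate outliers: patterns with above-average error.
    high = [e for e in errors if e > overall_mean]
    if len(high) < 2:
        return keep_all
    # Drop patterns at or above mean + alpha * std of the high-error subset.
    cutoff = mean(high) + alpha * pstdev(high)
    return [i for i, e in enumerate(errors) if e < cutoff]


# Toy example: 95 well-fit patterns, 4 moderate errors, 1 extreme error.
errors = [0.05] * 95 + [0.6] * 4 + [0.95]
kept = bdr_filter(errors)
print(len(kept), "of", len(errors), "patterns kept")
```

In a training loop, a pass like this would be repeated periodically on the shrinking training set. The summary's caution applies directly: the pattern removed here is only "noise" if its high error is not carrying real diagnostic information.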
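The GA feature-selection step summarized above can be sketched as a search over binary feature masks. This is a minimal sketch under stated assumptions, not the paper's implementation: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), all hyperparameter values, and the stand-in `toy_fitness` function are hypothetical. In the paper's setting, the fitness of a mask would instead come from the classification performance of a network trained on the selected features.

```python
import random


def ga_select_features(n_features, fitness, pop_size=20, generations=30,
                       crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask that maximises fitness(mask)."""
    rng = random.Random(seed)
    # Each chromosome holds one 0/1 flag per feature.
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        scored = [(fitness(ind), ind) for ind in population]

        def tournament():
            # Return the fitter of two randomly drawn individuals.
            a, b = rng.sample(scored, 2)
            return a[1] if a[0] >= b[0] else b[1]

        children = [best[:]]  # elitism: keep the best mask unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_features - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation on each gene.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best


# Hypothetical stand-in fitness: the first 10 of 30 features are "useful",
# and every other selected feature costs half a point.
INFORMATIVE = set(range(10))


def toy_fitness(mask):
    hits = sum(mask[i] for i in INFORMATIVE)
    return hits - 0.5 * (sum(mask) - hits)


mask = ga_select_features(30, toy_fitness)
print(sum(mask), "features kept, fitness", toy_fitness(mask))
```

Because the best mask is carried over each generation, the best fitness never decreases. Swapping `toy_fitness` for a validation-accuracy function makes each evaluation far more expensive, so population size and generation count become the main cost knobs.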

