Prediction Model of Breast Cancer Based on mRMR Feature Selection

Junwen Di, Zhiguo Shi
DOI: https://doi.org/10.1007/978-3-030-63820-7_4
2020-01-01
Abstract: Real-world data sets are often imbalanced, with large differences in sample counts across classes. The problem is especially prominent in medical data, where class proportions follow disease prevalence. This paper proposes P-mRMR, an algorithm based on mRMR (minimum redundancy, maximum relevance), to improve feature selection on imbalanced data. P-mRMR also handles attributes with many missing values by integrating the missing values into feature selection itself, which reduces the complexity of data pre-processing. In the experiments, the algorithms are compared using AUC, the confusion matrix, and the probability of missing values. The results show that the features selected by the improved algorithm yield better classifier performance.
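For orientation, the baseline mRMR idea that the abstract builds on can be sketched as a greedy loop: at each step, pick the feature whose mutual information with the class label is high and whose average mutual information with the already selected features is low. The sketch below is plain mRMR in its difference form on discrete features, not the paper's P-mRMR, and it does not reproduce the missing-value handling described above. The function names `mutual_information` and `mrmr_select`, the feature names, and the toy data are all made up for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedy mRMR, difference form: relevance minus mean redundancy.

    features: dict mapping a feature name to a list of discrete values.
    Returns the names of the k selected features in selection order.
    """
    relevance = {f: mutual_information(v, labels) for f, v in features.items()}
    selected = []
    remaining = set(features)

    def score(f):
        if not selected:
            return relevance[f]
        redundancy = sum(
            mutual_information(features[f], features[s]) for s in selected
        ) / len(selected)
        return relevance[f] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "f1_copy" duplicates "f1", so it adds no new information.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f1": [0, 0, 0, 1, 1, 1, 1, 1],
    "f1_copy": [0, 0, 0, 1, 1, 1, 1, 1],
    "f3": [0, 1, 0, 0, 1, 1, 0, 1],
}

print(mrmr_select(features, labels, 2))  # ['f1', 'f3']
```

In this toy case the duplicate `f1_copy` is as relevant as `f1`, but its redundancy with `f1` outweighs that relevance, so the weaker but less redundant `f3` is chosen second.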