A novel oversampling and feature selection hybrid algorithm for imbalanced data classification
Fang Feng, Kuan-Ching Li, Erfu Yang, Qingguo Zhou, Lihong Han, Amir Hussain, Mingjiang Cai
DOI: https://doi.org/10.1007/s11042-022-13240-0
IF: 2.577
2022-06-24
Multimedia Tools and Applications
Abstract: Traditional approaches tend to cause classifier bias on imbalanced data sets, resulting in poor classification performance for minority classes. In particular, imbalanced data are common in financial fraud, network intrusion, and fault detection, where the recognition rate of minority classes is more pertinent than the classification performance of majority classes. There is therefore a pressing need for efficient algorithms that solve the class imbalance problem. To this end, this article presents a novel hybrid algorithm, Negative Binary General (NBG), which improves the performance of imbalanced classification by combining an oversampling algorithm with a feature selection algorithm. A novel oversampling algorithm, Negative-positive Synthetic Minority Oversampling Technique (NPSMOTE), improves the practicability of sample generation, while the Binary Ant Lion Optimizer (BALO) algorithm extracts the most significant features to improve classification performance. Simulation experiments on seven benchmark imbalanced data sets demonstrate that the proposed NBG algorithm significantly outperforms nine other existing and six recently published algorithms in the classification of imbalanced small-sample data sets.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
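
The abstract names NPSMOTE as the oversampling step but does not describe how it generates samples. As a rough illustration of the general idea behind SMOTE-style oversampling only, the sketch below implements the classic interpolation scheme: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbors. This is plain SMOTE-like interpolation, not the authors' NPSMOTE, and the function name `smote_like_oversample`, its parameters, and the toy data are all assumptions made for this example.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by neighbor interpolation.

    Hypothetical helper for illustration: classic SMOTE-style interpolation,
    NOT the NPSMOTE algorithm from the paper.
    """
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)  # cannot have more neighbors than other samples
    rng = np.random.default_rng(seed)

    # Pairwise Euclidean distances between minority samples.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]  # k nearest per sample

    # Pick a random base sample and one of its neighbors for each new point.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate at a random position between base and neighbor.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Toy minority class: the four corners of the unit square (made-up data).
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_oversample(X_min, n_new=6)
print(synthetic.shape)
```

Because every synthetic point is a convex combination of two existing minority samples, the generated data stay inside the region spanned by the minority class. In a hybrid pipeline of the kind the abstract describes, a feature selection step would then be applied to the rebalanced data.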