A hybrid ensemble and evolutionary algorithm for imbalanced classification and its application on bioinformatics
Yongqing Zhang, Meng Lin, Yihan Yang, Chunli Ding
DOI: https://doi.org/10.1016/j.compbiolchem.2022.107646
IF: 3.737
2022-06-01
Computational Biology and Chemistry
Abstract: Imbalanced data classification is a fundamental problem in data mining. Researchers have proposed many solutions, such as sampling and ensemble learning methods. However, random under-sampling is prone to discarding representative samples, and ensemble learning does not use the correlation information between samples in the data set. We therefore propose a Hybrid Adaptive sampling with Bagging Classifier (HABC). Specifically, we calculate an adaptive sampling rate from the characteristics of the data set and then apply density-based under-sampling and over-sampling to the original data set according to that rate. The sampled data subset is passed to a Bagging classifier, which is used to predict the unknown data set. In addition, a multi-objective particle swarm optimization algorithm is used to optimize the prediction result. Extensive experiments on UCI, KEEL, and three bioinformatics datasets show that the proposed method outperforms state-of-the-art algorithms.
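The pipeline outlined in the abstract (an adaptive rate, then density-based under-sampling and over-sampling, then bagging) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the rate formula, the density measure, or the base learner, so everything below is an assumption chosen for illustration. The rate is taken as the geometric mean of the two class sizes, density is measured by mean distance to the k nearest same-class neighbours, over-sampling uses SMOTE-style interpolation, and the bagged learners are 1-nearest-neighbour classifiers. The function names `knn`, `hybrid_resample`, and `bagging_predict` are hypothetical, and the multi-objective particle swarm optimization step is omitted.

```python
import math
import random

def knn(point, pool, k):
    """Return the k points in pool closest to point (point itself excluded)."""
    others = [p for p in pool if p is not point]
    return sorted(others, key=lambda p: math.dist(point, p))[:k]

def hybrid_resample(major, minor, k=3, seed=0):
    """Resample both classes toward a common size.

    Assumes each class holds at least two points. All choices here are
    illustrative assumptions, not the method described in the paper.
    """
    rng = random.Random(seed)
    # Assumed adaptive rate: both classes move to the geometric mean size.
    target = round(math.sqrt(len(major) * len(minor)))

    # Density-based under-sampling (assumption): keep the majority points
    # whose k nearest majority neighbours are closest, i.e. the densest ones.
    def sparsity(p):
        nbrs = knn(p, major, k)
        return sum(math.dist(p, q) for q in nbrs) / len(nbrs)
    kept = sorted(major, key=sparsity)[:target]

    # Over-sampling (assumption): SMOTE-style interpolation between a
    # minority point and one of its k nearest minority neighbours.
    synthetic = []
    while len(minor) + len(synthetic) < target:
        p = rng.choice(minor)
        q = rng.choice(knn(p, minor, k))
        t = rng.random()
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(p, q)))
    return kept, list(minor) + synthetic

def bagging_predict(train, x, n_estimators=11, seed=0):
    """Majority vote of 1-NN learners, each using its own bootstrap sample.

    train is a list of (point, label) pairs with labels 0 or 1.
    """
    rng = random.Random(seed)
    votes = 0
    for _ in range(n_estimators):
        boot = [rng.choice(train) for _ in train]
        _, label = min(boot, key=lambda s: math.dist(s[0], x))
        votes += label
    return int(2 * votes > n_estimators)

# Toy data: 16 majority points near the origin, 4 minority points near (5, 5).
major = [(i * 0.5, j * 0.5) for i in range(4) for j in range(4)]
minor = [(5.0, 5.0), (5.5, 5.0), (5.0, 5.5), (5.5, 5.5)]

kept, balanced_minor = hybrid_resample(major, minor)
train = [(p, 0) for p in kept] + [(p, 1) for p in balanced_minor]
pred_minor = bagging_predict(train, (5.2, 5.1))
pred_major = bagging_predict(train, (0.3, 0.4))
```

On this toy data the two classes are resampled from 16 and 4 points to the same size, and the ensemble is then trained on the balanced subset. A real experiment would replace the assumed rate and density measure with those defined in the paper.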
biology, computer science, interdisciplinary applications