Review of Classification Methods on Unbalanced Data Sets

Le Wang, Meng Han, Xiaojuan Li, Ni Zhang, Haodong Cheng
DOI: https://doi.org/10.1109/access.2021.3074243
IF: 3.9
2021-01-01
IEEE Access
Abstract: This paper studies the classification of unbalanced data sets. First, this kind of data set is briefly introduced, and then classification methods for unbalanced data sets are analyzed in detail from several perspectives: data sampling, the algorithm level, the feature level, cost-sensitive functions, and deep learning. In addition, the data sampling methods are grouped by underlying technique, including classification methods for unbalanced data sets based on the synthetic minority over-sampling technique (SMOTE), support vector machines (SVM), and k-nearest neighbors (KNN). The advantages and disadvantages of these methods are then compared. Finally, the evaluation criteria for classifiers on unbalanced data sets are summarized, and directions for future work are outlined.
computer science, information systems; telecommunications; engineering, electrical & electronic