Optimizing the classification accuracy of imbalanced dataset based on SVM

Sheng Zhang, Xiuyu Shang, Wei Wang, XiuLi Huang
DOI: https://doi.org/10.1109/ICCASM.2010.5620370
2010-01-01
Abstract: A dataset is called imbalanced if at least one class is represented by significantly fewer samples than the others. Imbalanced data is common in the real world. The performance of traditional machine learning algorithms degrades on classification tasks involving imbalanced datasets. Support Vector Machines (SVM) are a new kind of machine learning method based on the structural risk minimization principle and have achieved the best performance so far in several challenging applications. This paper first summarizes the applications of SVM to imbalanced datasets and then presents the main improved methods that greatly improve classification performance on imbalanced datasets.