A Novel SVM Modeling Approach For Highly Imbalanced And Overlapping Classification

Yu Qu, Hongye Su, Lichao Guo, Jian Chu
DOI: https://doi.org/10.3233/IDA-2010-0470
IF: 1.7
2011-01-01
Intelligent Data Analysis
Abstract: Traditional classification algorithms can be limited in their performance on highly imbalanced and overlapping data sets. In this paper, we focus on modifying support vector machines (SVMs) to make them suitable for highly imbalanced and overlapping (HIO) classification. Based on an analysis of most SVM learning algorithms for imbalanced classification, we argue that in SVM-based algorithms, due to the linearity property of SVM, the key problem is that increasing the number of correctly predicted minority samples leads to even more majority samples being misclassified. We then develop a novel algorithm, HIO-SVM, which can recognize all minority samples while minimizing the error rate of majority ones. The proposed approach identifies the non-overlapping samples in one feature space; furthermore, by iteratively shifting kernel spaces, all non-overlapping samples in different kernel spaces are recognized. Because of the highly imbalanced distribution, the remaining overlapping samples can be regarded as minority. All minority samples can then be predicted correctly, and the error rate of majority samples is guaranteed to be minimized simultaneously. Finally, numerous case studies demonstrate the properties and effectiveness of the proposed HIO-SVM algorithm.
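The trade-off named in the abstract, where recovering more minority samples misclassifies even more majority samples, can be illustrated with a toy example. The sketch below is not the HIO-SVM algorithm and does not train an SVM. It assumes a single threshold on a one-dimensional linear score, and the data, function names, and class sizes are all hypothetical choices made for illustration.

```python
# Toy 1-D illustration (NOT the paper's HIO-SVM): one threshold on a linear
# score separates a large majority class from a small minority class.
# All data and names here are hypothetical.

def make_toy_scores():
    """Return (majority, minority) score lists that overlap near the top."""
    # 100 majority scores spread evenly over [0.00, 0.99].
    majority = [i / 100 for i in range(100)]
    # 5 minority scores; three of them fall inside the majority range.
    minority = [0.805, 0.905, 0.955, 1.05, 1.10]
    return majority, minority

def errors_at(threshold, majority, minority):
    """Predict 'minority' when score >= threshold; count errors per class."""
    missed_minority = sum(1 for s in minority if s < threshold)
    misclassified_majority = sum(1 for s in majority if s >= threshold)
    return missed_minority, misclassified_majority

def sweep(majority, minority):
    """Lower the threshold just far enough to catch one more minority sample
    at each step, recording (minority recovered, majority misclassified)."""
    rows = []
    for s in sorted(minority, reverse=True):
        missed, wrong_majority = errors_at(s, majority, minority)
        rows.append((len(minority) - missed, wrong_majority))
    return rows

majority, minority = make_toy_scores()
for recovered, wrong_majority in sweep(majority, minority):
    print(f"minority recovered: {recovered}/{len(minority)}, "
          f"majority misclassified: {wrong_majority}")
```

In this toy data the two minority samples outside the overlap region cost nothing to recover, while recovering the remaining three raises the number of misclassified majority samples to 4, 9, and then 19. This is the kind of imbalance in cost that, according to the abstract, motivates treating overlapping and non-overlapping samples differently.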