Abstract: Imbalanced problems have recently attracted much attention due to their prevalence in numerous domains of great importance to the data mining community. However, conventional bi-class classification approaches, e.g., the support vector machine (SVM), generally perform poorly on imbalanced datasets because they were originally designed to generalize from the training data and pay little attention to the minority class. In this paper, we extend the traditional support vector domain description (SVDD) and propose a novel density-sensitive SVDD classifier based on maximum soft margin (DSMSM-SVDD) for imbalanced datasets. In the proposed approach, relative density-based penalty weights are incorporated into the optimization objective function to represent the importance of the data samples. When the objective function is optimized with these weights, training majority samples with high relative densities are more likely to lie inside the hypersphere, thus eliminating the noise effects that hamper traditional SVDD. In addition, to make full use of the minority class samples to refine the boundary in training, a maximum soft margin regularization term, inspired by the soft margin maximization of traditional SVM, is also introduced. This allows the optimal domain description boundary to skew further toward the minority class than in traditional SVDD and thus improves the classification accuracy. Finally, an AdaBoost ensemble version of DSMSM-SVDD is developed to further improve generalization performance and stability on imbalanced datasets. Extensive experimental results on various datasets demonstrate that the proposed approach significantly outperforms other existing algorithms on imbalanced classification problems in terms of the G-Mean, F-Measure and AUC performance measures.
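The abstract reports results in G-Mean and F-Measure rather than plain accuracy. As a minimal sketch of why, the snippet below computes both from the standard confusion-matrix definitions (G-Mean as the geometric mean of sensitivity and specificity, F-Measure as the harmonic mean of precision and recall). The function name `imbalance_metrics` and the convention that the minority class is labeled `1` are assumptions made for illustration; they are not taken from the paper.

```python
import math


def imbalance_metrics(y_true, y_pred, positive=1):
    """Return G-Mean and F-Measure, treating `positive` as the minority class.

    Hypothetical helper for illustration only; not the paper's evaluation code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against empty denominators (e.g. no positive predictions).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # minority-class recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # majority-class recall
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    g_mean = math.sqrt(sensitivity * specificity)
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {"g_mean": g_mean, "f_measure": f_measure}


# Toy data: 8 majority samples (label 0) and 2 minority samples (label 1).
y_true = [0] * 8 + [1] * 2

# A classifier that always predicts the majority class is 80% accurate,
# yet both measures drop to zero because no minority sample is recovered.
always_majority = imbalance_metrics(y_true, [0] * 10)

# A classifier that finds one minority sample at the cost of one false alarm.
one_hit = imbalance_metrics(y_true, [0] * 7 + [1] + [1, 0])
```

On this toy data the always-majority classifier scores zero on both measures despite its 80% accuracy, which is the failure mode these measures are meant to expose.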

Self-adaptive cost weights-based support vector machine cost-sensitive ensemble for imbalanced data classification

Imbalanced Data Sets Classification Method Based on Over-Sampling Technique

Imbalanced Data Classification Algorithm Based on Integrated Sampling and Ensemble Learning.

Support Vector Machines Ensemble With Optimizing Weights By Genetic Algorithm

Weighted Support Vector Machine for Classification with Uneven Training Class Sizes

Hybrid SVM algorithm oriented to classifying imbalanced datasets

Improved SVM algorithm for imbalanced dataset classification

The Ensemble of Density-Sensitive SVDD Classifier Based on Maximum Soft Margin for Imbalanced Datasets.

Adaptive Sampling With Optimal Cost For Class-Imbalance Learning

Adaptive Subspace Optimization Ensemble Method for High-Dimensional Imbalanced Data Classification

A weighted hybrid ensemble method for classifying imbalanced data

Adaptive ensemble of classifiers with regularization for imbalanced data classification

An Adaptive Cost-Sensitive Learning and Recursive Denoising Framework for Imbalanced SVM Classification

Self-paced Ensemble for Highly Imbalanced Massive Data Classification

Adaptive Fusion Based Method for Imbalanced Data Classification

An ensemble imbalanced classification method based on model dynamic selection driven by data partition hybrid sampling

A Novel Imbalanced Data Classification Method Based on Weakly Supervised Learning for Fault Diagnosis

Adaptive Weight Optimization for Classification of Imbalanced Data.

Classification of Imbalanced Credit scoring data sets Based on Ensemble Method with the Weighted-Hybrid-Sampling

A Statistical Approach to Cost-Sensitive AdaBoost for Imbalanced Data Classification

A Classification Method For Imbalance Data Set Based on Kernel SMOTE