Adaptively weighted three-way decision oversampling: A cluster imbalanced-ratio based approach
Xinli Wang, Juan Gong, Yan Song, Jianhua Hu
DOI: https://doi.org/10.1007/s10489-022-03394-7
IF: 5.3
2022-04-15
Applied Intelligence
Abstract: Oversampling is an effective approach to imbalanced learning because it restores class balance simply by synthesizing new samples. However, synthesizing samples precisely remains a significant challenge, primarily because of noise samples, within-class imbalance, and the selection of boundary samples. To address these problems, this paper proposes an improved oversampling method for imbalanced learning, called adaptively weighted three-way decision oversampling (AWTDO). AWTDO works in three main steps. First, it roughly removes noise samples, applies the K-means clustering algorithm to the raw data to form multiple clusters, and calculates the imbalanced ratio of each cluster. Second, it uses three-way decision to classify the clusters by imbalanced ratio into three categories (positive domain, boundary domain, and negative domain) and assigns each cluster a number of synthetic samples that depends on its category. Third, it deterministically selects the target minority samples in each cluster and generates new synthetic samples by stochastic linear interpolation, according to different sampling weights. Comparative experiments on public datasets show that AWTDO outperforms nine state-of-the-art oversampling methods.
computer science, artificial intelligence
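The three steps in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions the abstract does not give: the thresholds `alpha` and `beta`, the `REGION_WEIGHT` table, the reading of "positive" as a minority-dominated cluster, and the rule that splits the synthetic budget by weight times minority count. Noise removal and K-means are omitted, so cluster assignments are taken as given. Interpolation partners are drawn at random here, whereas the paper selects target samples deterministically.

```python
import random
from collections import defaultdict

def cluster_ratios(cluster_ids, labels, minority=1):
    """Per-cluster (majority count, minority count, majority/minority ratio)."""
    counts = defaultdict(lambda: [0, 0])
    for c, y in zip(cluster_ids, labels):
        counts[c][1 if y == minority else 0] += 1
    return {c: (maj, mino, maj / mino if mino else float("inf"))
            for c, (maj, mino) in counts.items()}

def region(ratio, alpha=1.0, beta=3.0):
    """Three-way split on the imbalanced ratio (alpha, beta are assumed thresholds)."""
    if ratio <= alpha:
        return "positive"
    if ratio <= beta:
        return "boundary"
    return "negative"

# Assumed per-region sampling weights; a negative-region cluster receives nothing.
REGION_WEIGHT = {"positive": 1.0, "boundary": 2.0, "negative": 0.0}

def allocate(ratios, total):
    """Split `total` synthetic samples across clusters by weight * minority count."""
    score = {c: REGION_WEIGHT[region(r)] * mino for c, (_, mino, r) in ratios.items()}
    norm = sum(score.values())
    if norm == 0:
        return {c: 0 for c in ratios}
    quota = {c: int(total * s / norm) for c, s in score.items()}
    # Hand the rounding remainder to the highest-scoring clusters first.
    leftover = total - sum(quota.values())
    for c in sorted(score, key=score.get, reverse=True)[:leftover]:
        quota[c] += 1
    return quota

def oversample(points, cluster_ids, labels, minority=1, seed=0):
    """Return synthetic minority points that bring the two class sizes level."""
    rng = random.Random(seed)
    ratios = cluster_ratios(cluster_ids, labels, minority)
    n_maj = sum(maj for maj, _, _ in ratios.values())
    n_min = sum(mino for _, mino, _ in ratios.values())
    quota = allocate(ratios, max(n_maj - n_min, 0))
    members = defaultdict(list)
    for p, c, y in zip(points, cluster_ids, labels):
        if y == minority:
            members[c].append(p)
    synthetic = []
    for c, k in quota.items():
        for _ in range(k):
            # Linear interpolation between two minority points of the same cluster.
            x, z = rng.choice(members[c]), rng.choice(members[c])
            lam = rng.random()
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, z)))
    return synthetic

# Toy data: three clusters, label 1 is the minority class.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (1.0, 1.0), (1.1, 0.9),
          (1.2, 1.1), (0.9, 1.2), (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
cluster_ids = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
new_points = oversample(points, cluster_ids, labels)
```

On the toy data, cluster 2 contains no minority samples, so it falls in the negative region and receives no synthetic points. The budget of four new samples is shared between clusters 0 and 1. Interpolating inside a cluster keeps each synthetic point within the span of that cluster's minority samples, which is the motivation for clustering before oversampling.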