Robust Outlier Detection Based on the Changing Rate of Directed Density Ratio

Kangsheng Li, Xin Gao, Shiyuan Fu, Xinping Diao, Ping Ye, Bing Xue, Jiahao Yu, Zijian Huang
DOI: https://doi.org/10.1016/j.eswa.2022.117988
IF: 8.5
2022-01-01
Expert Systems with Applications
Abstract: Outlier detection aims to find abnormal objects that deviate from the distribution of normal data. Traditional unsupervised outlier detection methods can detect most global outliers, but they perform well only when the data follow a relatively simple, single distribution. Methods based on k-nearest neighbors can fit more complex data distributions, but they either struggle to detect local outliers or are easily affected by data manifolds. In addition, the detection performance of most k-nearest-neighbor methods depends heavily on the parameter k. We propose a robust outlier detection method based on the changing rate of the directed density ratio. The local density of a sample is calculated by combining kernel density estimation with an extended neighbor set, which contains both the k-nearest neighbors and the reverse k-nearest neighbors of the sample. We then define the directed density ratio of a sample from the density ratio and the vectors between the sample and its neighbors. The directed density ratio gives a better estimate of local information under different local densities and data manifolds. Finally, as the neighborhood size is increased, the changes in the directed density ratio of a sample are calculated and summed to give its outlier score. Experiments are carried out on 12 synthetic datasets that simulate different data distributions and on 22 public datasets. The results show that, compared with several state-of-the-art methods, the proposed method achieves better outlier detection performance under different data distributions. The proposed method is also more robust to changes in the parameter k.
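The abstract gives the steps of the method but not its formulas, so the following Python sketch is only a rough illustration of the pipeline, not the authors' implementation. Everything specific in it is an assumption: the Gaussian kernel, the bandwidth heuristic, the definition of the directed density ratio as the norm of the ratio-weighted mean of unit vectors toward the neighbors, the use of absolute differences as the "change", and the names `extended_neighbor_mask`, `directed_density_ratio`, `outlier_scores`, `k_min`, and `k_max`.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero (assumption, not from the paper)

def pairwise_distances(X):
    """Euclidean distance matrix between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def extended_neighbor_mask(dist, k):
    """Boolean mask: mask[i, j] is True if j is a k-nearest neighbor of i
    or a reverse k-nearest neighbor of i (i.e. i is among j's k nearest)."""
    n = dist.shape[0]
    d = dist.copy()
    np.fill_diagonal(d, np.inf)              # a sample is not its own neighbor
    knn = np.argsort(d, axis=1)[:, :k]
    is_knn = np.zeros((n, n), dtype=bool)
    is_knn[np.arange(n)[:, None], knn] = True
    return is_knn | is_knn.T

def directed_density_ratio(X, dist, k, bandwidth):
    """Assumed form: norm of the mean of (density ratio * unit vector)
    taken over the extended neighbor set of each sample."""
    mask = extended_neighbor_mask(dist, k)
    counts = mask.sum(axis=1)
    # Gaussian kernel density over the extended neighbor set (assumed kernel).
    kernel = np.exp(-dist ** 2 / (2.0 * bandwidth ** 2))
    density = (kernel * mask).sum(axis=1) / counts
    ratio = density[None, :] / (density[:, None] + EPS)   # density[j] / density[i]
    diff = X[None, :, :] - X[:, None, :]                  # diff[i, j] = x_j - x_i
    unit = diff / np.maximum(dist, EPS)[:, :, None]
    weighted = (ratio * mask)[:, :, None] * unit
    return np.linalg.norm(weighted.sum(axis=1) / counts[:, None], axis=1)

def outlier_scores(X, k_min=3, k_max=10, bandwidth=None):
    """Sum of absolute changes of the directed density ratio as k grows."""
    X = np.asarray(X, dtype=float)
    dist = pairwise_distances(X)
    if bandwidth is None:
        # Heuristic (assumption): median distance to the k_max-th neighbor.
        # Column 0 of the sorted rows is the zero self-distance.
        bandwidth = max(np.median(np.sort(dist, axis=1)[:, k_max]), EPS)
    ddr = np.stack([directed_density_ratio(X, dist, k, bandwidth)
                    for k in range(k_min, k_max + 1)])
    return np.abs(np.diff(ddr, axis=0)).sum(axis=0)

# Toy usage: a Gaussian cluster plus one distant point.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(60, 2)), [[8.0, 8.0]]])
scores = outlier_scores(X)
```

In this sketch a larger score means the assumed directed density ratio changed more as k grew from `k_min` to `k_max`. Whether that ranks outliers as well as the published method does depends entirely on the assumed formulas, so the paper itself remains the reference for the actual definitions.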