Abstract: Outlier detection aims to identify abnormal objects that deviate from the distribution of normal data. Traditional unsupervised outlier detection methods can detect most global outliers, but they perform well only when the data follow a relatively simple distribution. Methods based on k-nearest neighbors can fit more complex data distributions, but they often fail to detect local outliers, and their performance is easily influenced by data manifolds. In addition, the detection performance of most k-nearest-neighbor methods depends heavily on the parameter k. We propose a robust outlier detection method based on the changing rate of the directed density ratio. The local density of each sample is calculated by combining kernel density estimation with an extended neighbor set that contains both its k-nearest neighbors and its reverse k-nearest neighbors. We then define the directed density ratio of a sample from the density ratio and the vectors between the sample and its neighbors. The directed density ratio gives a better estimate of local information under different local densities and data manifolds. Next, as the neighborhood size is increased, the change in each sample's directed density ratio is calculated, and these changes are summed to give the outlier score. Experiments are carried out on 12 synthetic datasets that simulate different data distributions and on 22 public datasets. The results show that, compared with several state-of-the-art methods, the proposed method achieves better outlier detection performance under different data distributions. The proposed method is also more robust to changes in the parameter k.
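The pipeline described in the abstract can be illustrated with a rough sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration rather than the authors' definition: the Gaussian kernel with bandwidth `h`, the way the density ratio is combined with unit direction vectors, the use of absolute differences between consecutive values of k, and all function names.

```python
import math

def _dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(points, k):
    """Indices of the k nearest neighbors of every point (brute force)."""
    result = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: _dist(p, points[j]))
        result.append(others[:k])
    return result

def extended_neighbors(points, k):
    """Union of k-nearest neighbors and reverse k-nearest neighbors."""
    nearest = knn(points, k)
    ext = [set(nbrs) for nbrs in nearest]
    for i, nbrs in enumerate(nearest):
        for j in nbrs:
            ext[j].add(i)  # i lists j among its kNN, so i is a reverse neighbor of j
    return ext

def local_density(points, ext, h=1.0):
    """Assumed form: mean Gaussian kernel value over the extended neighbor set."""
    dens = []
    for i, nbrs in enumerate(ext):
        total = sum(math.exp(-_dist(points[i], points[j]) ** 2 / (2 * h * h))
                    for j in nbrs)
        dens.append(total / len(nbrs) + 1e-12)  # small constant avoids division by zero
    return dens

def directed_density_ratio(points, ext, dens):
    """Assumed form: length of the mean of density-ratio-weighted unit vectors."""
    ddr = []
    for i, nbrs in enumerate(ext):
        acc = [0.0] * len(points[i])
        for j in nbrs:
            d = _dist(points[i], points[j])
            if d == 0.0:
                continue  # duplicate point carries no direction
            w = dens[j] / dens[i]
            for c in range(len(acc)):
                acc[c] += w * (points[j][c] - points[i][c]) / d
        ddr.append(math.sqrt(sum(a * a for a in acc)) / len(nbrs))
    return ddr

def outlier_scores(points, k_min, k_max, h=1.0):
    """Sum the absolute change of the directed density ratio as k grows."""
    scores = [0.0] * len(points)
    prev = None
    for k in range(k_min, k_max + 1):
        ext = extended_neighbors(points, k)
        cur = directed_density_ratio(points, ext, local_density(points, ext, h))
        if prev is not None:
            for i in range(len(points)):
                scores[i] += abs(cur[i] - prev[i])
        prev = cur
    return scores

# Hypothetical toy data: three nearby points and one distant point.
sample = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (10.0, 0.0)]
print(outlier_scores(sample, k_min=1, k_max=3))
```

The intuition behind weighting unit vectors is that a sample surrounded evenly by neighbors has directions that largely cancel, while a sample at the edge of a cluster has neighbors on one side only. Whether this matches the paper's actual definition cannot be determined from the abstract alone.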

Efficient Outlier Detection for High-Dimensional Data

A High-Dimensional Outlier Detection Algorithm Base on Relevant Subspace.

Outlier Detection Using Local Density and Global Structure

An effective and efficient algorithm for high-dimensional outlier detection

An Outlier Detection Algorithm based on Local Density and Natural Neighbors

Outlier detection algorithm based on k-nearest neighbors-local outlier factor

Outlier detection for high dimensional data

Outlier detection method based on high-density iteration

A Novel Density-Based Outlier Detection Approach for Low Density Datasets

A neighborhood weighted-based method for the detection of outliers

A Method for Measurement Data Modeling and High-Dimensional Outlier Detection Based on Large Dimensional Matrix

Projected outlier detection in high-dimensional mixed-attributes data set

Local Dynamic Neighborhood Based Outlier Detection Approach and Its Framework for Large-Scale Datasets

Outlier Detection Based on Eigenspace Subtracting

An Efficient Density-Based Local Outlier Detection Approach for Scattered Data.

Locally linear embedding method for high dimensional data outlier detection

Finding Centric Local Outliers in Categorical/numerical Spaces.

Outlier Detection Using Structural Scores in a High-Dimensional Space.

An Outlier Detection Algorithm Based on Cross-Correlation Analysis for Time Series Dataset

An Efficient Algorithm for Distributed Outlier Detection in Large Multi-Dimensional Datasets

Robust Outlier Detection Based on the Changing Rate of Directed Density Ratio