Abstract: Outlier detection is of vital importance in data mining, with numerous applications including video surveillance and credit card fraud detection. Many outlier detection algorithms have been developed and have received considerable attention; most existing methods are either distance-based or density-based. However, both approaches have flaws: the former has difficulty detecting local outliers, and the latter cannot handle low-density patterns. Moreover, outlier detection algorithms are sensitive to parameter settings. To address these issues, this paper proposes a simple and efficient outlier detection approach, called ADD, based on the average divergence difference of data objects; it does not require the number of neighbors k to be set manually. In this algorithm, two new measures, called the divergence factor (DF) and the average divergence difference (LADD), are developed based on the skewed distribution characteristics of data objects and their natural neighbors, thus improving the accuracy of local outlier detection from a new perspective. These measures serve as external and internal characterization factors, respectively: the former characterizes the skewed distribution characteristics and compactness relationship of data objects, and the latter represents the difference in the skewed distribution characteristics of data objects within a neighborhood. An appropriate threshold is then set to decide whether a data point is an outlier, which avoids the interference of the Top-N problem. Experimental results show that, compared with state-of-the-art algorithms, ADD achieves an overall improvement in local outlier detection, especially on datasets with complex distributions and for outliers in low-density areas.
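
The abstract says that ADD relies on natural neighbors so that k need not be chosen by hand, but it does not say how those neighbors are found. As a rough illustration only, the sketch below follows one commonly described natural-neighbor search: the neighborhood size r grows until every point has been chosen as a neighbor by some other point, or until the number of such unchosen points stops changing. The function name `natural_neighbors`, the stopping rule, and the tie-breaking by index are assumptions made for this example and are not taken from the paper. DF and LADD are not shown because the abstract does not define them.

```python
import math


def natural_neighbors(points):
    """Hypothetical helper: find mutual ("natural") neighbors without a fixed k.

    Returns (r, neighbors), where r is the neighborhood size reached when the
    search stops and neighbors[i] lists the points j such that i and j are each
    among the other's r nearest neighbors.
    """
    n = len(points)
    # order[i]: indices of all other points, nearest first (ties broken by index).
    order = [
        sorted(
            (j for j in range(n) if j != i),
            key=lambda j, i=i: (math.dist(points[i], points[j]), j),
        )
        for i in range(n)
    ]

    reverse_count = [0] * n  # how often each point was picked as a neighbor
    prev_orphans = None
    r = 0
    while r < n - 1:
        r += 1
        # In round r, each point "votes" for its r-th nearest neighbor.
        for i in range(n):
            reverse_count[order[i][r - 1]] += 1
        orphans = reverse_count.count(0)  # points nobody has picked yet
        # Assumed stopping rule: no orphans left, or the orphan count is stable.
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans

    knn = [set(order[i][:r]) for i in range(n)]
    neighbors = [sorted(j for j in knn[i] if i in knn[j]) for i in range(n)]
    return r, neighbors


# Four points on a unit square plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
r, neighbors = natural_neighbors(points)
```

In this toy input the isolated point ends up with an empty natural-neighbor list, because no point in the square ever ranks it among its nearest neighbors. A neighborhood-based score such as the one the abstract describes would then be computed over these lists rather than over a user-supplied k.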

A double-weighted outlier detection algorithm considering the neighborhood orientation distribution of data objects

A neighborhood weighted-based method for the detection of outliers

Sample Weighting: an Inherent Approach for Outlier Suppressing Discriminant Analysis

Outlier Detection Algorithm Based on Reachable Neighbor

ADD: a new average divergence difference-based outlier detection method with skewed distribution of data objects

Rare Category Detection Algorithm Based on Weighted Boundary Degree

NDOD: an Efficient Neighboring Dependent Outlier Detector for Bias Distributed Large Datasets

Outlier detection method based on high-density iteration

LDBOD: A novel local distribution based outlier detector

DWOF: A Robust Density-Based Outlier Detection Approach

Detecting outliers by clustering algorithms

SDROF: outlier detection algorithm based on relative skewness density ratio outlier factor

Continuous Angle-based Outlier Detection on High-dimensional Data Streams

WeightLOFCC: A Heuristic Weight-Setting Strategy of LOF Applied to Outlier Detection in Time Series Data

Dimensionality-Aware Outlier Detection: Theoretical and Experimental Analysis

Efficient Outlier Detection Algorithm Based on Support Vector Data Description

Improved Method for Noise Detection by DBSCAN and Angle Based Outlier Factor in High Dimensional Datasets

A New Outlier Detection Algorithm Based on Fast Density Peak Clustering Outlier Factor

Comparative Study of Neighbor-based Methods for Local Outlier Detection

Outlier detection algorithm based on k-nearest neighbors-local outlier factor

A Robust and Efficient Boundary Point Detection Method by Measuring Local Direction Dispersion