Abstract: Outlier detection is of vital importance in data mining, with numerous applications including video surveillance and credit card fraud detection. Many outlier detection algorithms have been developed and have received considerable attention; most existing methods are either distance-based or density-based. Both approaches have flaws: the former has difficulty detecting local outliers, and the latter cannot handle low-density patterns. Moreover, outlier detection algorithms are sensitive to parameter settings. To address these issues, this paper proposes a simple and efficient outlier detection approach (called ADD) based on the average divergence difference of data objects, which does not require the number of neighbors k to be set manually. In this algorithm, two new measures, called the divergence factor (DF) and the average divergence difference (LADD), are developed based on the skewed distribution characteristics of data objects and their natural neighbors, thus improving the accuracy of local outlier detection. These measures serve as external and internal characterization factors, respectively: the former characterizes the skewed distribution and compactness relationship of data objects, and the latter represents the difference in the skewed distribution characteristics of data objects within a neighborhood. We then set an appropriate threshold to decide whether a data point is an outlier, which eliminates the interference of the Top-N problem. Experimental results show that, compared with state-of-the-art algorithms, the ADD algorithm achieves an overall improvement in local outlier detection, especially for outliers in datasets with complex distributions and in low-density areas.
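The abstract relies on natural neighbors to avoid choosing k by hand but does not spell out how they are found. The sketch below illustrates one common formulation of a natural-neighbor search: the neighborhood size r grows until the number of points that nobody lists as a neighbor stops changing. It is an illustration under stated assumptions, not the paper's implementation: the function name `natural_neighbors`, the stopping rule, and the mutual-neighbor definition are all assumptions, and the DF and LADD measures are not computed here because the abstract does not define them.

```python
import math


def natural_neighbors(points):
    """Hypothetical helper: return (r, neighbors) for a list of coordinate tuples.

    neighbors[i] is the set of points j such that i and j are both among
    each other's r nearest neighbors, where r is found adaptively.
    """
    n = len(points)
    # order[i] lists every other point, sorted by distance from point i.
    order = [
        sorted((j for j in range(n) if j != i),
               key=lambda j: math.dist(points[i], points[j]))
        for i in range(n)
    ]
    knn = [set() for _ in range(n)]   # current r nearest neighbors of each point
    reverse_count = [0] * n           # how many points list i as a neighbor
    prev_orphans = None
    r = 0
    while r < n - 1:
        # Extend every neighborhood by one more nearest neighbor.
        for i in range(n):
            j = order[i][r]
            knn[i].add(j)
            reverse_count[j] += 1
        r += 1
        # Assumed stopping rule: halt once the count of points with no
        # reverse neighbor reaches zero or stops changing between rounds.
        orphans = sum(1 for c in reverse_count if c == 0)
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    # Keep only mutual neighbors.
    neighbors = [{j for j in knn[i] if i in knn[j]} for i in range(n)]
    return r, neighbors


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    r, nbrs = natural_neighbors(data)
    print("adaptive neighborhood size:", r)
    for idx, s in enumerate(nbrs):
        print(idx, sorted(s))
```

In this toy dataset the isolated point `(10, 10)` ends up with no natural neighbors, because no clustered point ever lists it among its nearest neighbors before the search stops. A method of the kind the abstract describes would then score each point against its natural neighborhood rather than against a user-chosen k.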

A fast MST-inspired kNN-based outlier detection method

A neighborhood weighted-based method for the detection of outliers

Outlier Detection Algorithm Based on Reachable Neighbor

Outlier detection algorithm based on k-nearest neighbors-local outlier factor

A New Outlier Detection Algorithm Based on Fast Density Peak Clustering Outlier Factor

Robust Multi-Kernel Nearest Neighborhood for Outlier Detection

Intelligent Identification and Order-Sensitive Correction Method of Outliers from Multi-Data Source Based on Historical Data Mining

MSD-Kmeans: A Novel Algorithm for Efficient Detection of Global and Local Outliers

Comparative Study of Neighbor-based Methods for Local Outlier Detection

NDOD: an Efficient Neighboring Dependent Outlier Detector for Bias Distributed Large Datasets

A local search algorithm for k-means with outliers

ADD: a new average divergence difference-based outlier detection method with skewed distribution of data objects

MSD-Kmeans: A Hybrid Algorithm for Efficient Detection of Global and Local Outliers

An Outlier Detection Method Based On Symmetry and Curvature Threshold

A non‐parametric statistical test method to detect significant cross‐outliers in spatial points

Detecting outliers by clustering algorithms

Outlier detection method based on high-density iteration

MS2OD: Outlier Detection Using Minimum Spanning Tree and Medoid Selection

Entropy-based Outlier Detection Using Spark

A method for outlier detection based on cluster analysis and visual expert criteria

A Genetic Algorithm Based Technique for Outlier Detection with Fast Convergence