A High-Dimensional Outlier Detection Algorithm Base on Relevant Subspace.

Zhipeng Gao, Yang Zhao, Kun Niu, Yidan Fan
DOI: https://doi.org/10.1109/DASC-PICom-DataCom-CyberSciTec.2017.165
2017-01-01
Abstract: Outlier detection in high-dimensional big data is an important data mining task that distinguishes outliers from regular objects. Traditional outlier detection approaches search the full data space and therefore miss outliers hidden in subspaces. These methods also degrade under the notorious curse of dimensionality: distances no longer express the deviation between outliers and normal objects, and the exponential computational cost leads to low efficiency. In this paper, we propose an outlier detection method based on relevant subspaces, which can effectively describe the local distribution of objects and detect outliers hidden in subspaces of the data. Thorough experiments on synthetic and real data show that the method outperforms competing outlier ranking approaches by detecting outliers in subspaces.
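The claim that full-space distances stop separating outliers from normal objects as dimensionality grows can be illustrated with a small sketch. This is not the paper's method. It assumes uniformly random data, and the helper name relative_contrast is hypothetical. It measures how far the farthest point is from a query point relative to the nearest one; a value shrinking toward zero means all points look roughly equidistant.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points uniform random points in [0, 1]^n_dims.

    Hypothetical helper for illustration only; uniform data is an assumption.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


# Print the contrast for increasing dimensionality.
for dims in (2, 10, 100, 1000):
    print(dims, round(relative_contrast(500, dims), 3))
```

Under these assumptions the printed contrast falls sharply as the number of dimensions increases, which is the usual motivation for scoring objects in lower-dimensional subspaces rather than in the full space.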