Impact of Dimensionality Reduction on Outlier Detection: an Empirical Study

Vivek Vaidya, Jaideep Vaidya
DOI: https://doi.org/10.1109/tps-isa56441.2022.00028
Abstract: Outlier detection is a fundamental data analytics technique used in many security applications. Numerous outlier detection techniques exist, and in most cases they are used to identify outliers directly, without any interaction. The underlying data is typically high dimensional and complex. Even when outliers are identified, humans can easily grasp only low dimensional spaces, so it is difficult for a security expert to understand or visualize why a particular event or record has been identified as an outlier. In this paper we study the extent to which outlier detection techniques work in fewer dimensions and how well dimensionality reduction techniques still enable accurate detection of outliers. This can help us understand the extent to which data can be visualized while still retaining the intrinsic outlyingness of the outliers.