Multivariate outlier detection based on self-organizing map and adaptive nonlinear map and its application

Xuefeng Yan
DOI: https://doi.org/10.1016/j.chemolab.2011.04.007
IF: 4.175
2011-01-01
Chemometrics and Intelligent Laboratory Systems
Abstract: To facilitate the visualization and detection of outliers in high-dimensional complex data, a novel method integrating a self-organizing map (SOM) with an adaptive nonlinear map (ANLM) is proposed for multivariate outlier detection. First, the high-dimensional data are pre-processed by robust scaling. Second, the SOM maps the pre-processed data onto the SOM plane to obtain the topology of the data, and ANLM then produces a 2-dimensional projection of the trained SOM plane on which the data distribution can be visualized easily. Subsequently, based on the 2-dimensional plane and the topology, a quasi-3 δ edit rule is proposed to distinguish normal data from outliers. Finally, the method is illustrated using synthetic data, two standard benchmark data sets, and a real industrial process data set. The empirical results show that outliers in high-dimensional complex data are easily visualized on the 2-dimensional plane and are effectively detected and eliminated by the quasi-3 δ edit rule, demonstrating the method's ability to deal with outliers in such data.
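The first two steps of the abstract (robust scaling, then SOM training) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not say which robust scaling is used, so median/MAD scaling is assumed; the SOM is a plain online rectangular SOM with hypothetical hyperparameters; the ANLM projection is omitted because the abstract does not describe it; and the cutoff on quantization error is a generic median/MAD three-sigma-style rule standing in for the quasi-3 δ edit rule, whose definition is not given here. All function names are illustrative.

```python
import numpy as np

def robust_scale(X):
    """Center each column by its median and scale by 1.4826 * MAD (assumed scaling)."""
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    scale = 1.4826 * mad
    scale[scale == 0] = 1.0  # guard against constant columns
    return (X - med) / scale

def train_som(X, rows=5, cols=5, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM with online updates; returns (weights, grid)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of each node on the SOM plane.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialize node weights from randomly chosen samples.
    weights = X[rng.integers(0, len(X), size=rows * cols)].copy()
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                    # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # shrinking neighborhood radius
        x = X[rng.integers(0, len(X))]
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))  # best-matching unit
        d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))       # Gaussian neighborhood on the grid
        weights += lr * h[:, None] * (x - weights)
    return weights, grid

def quantization_errors(X, weights):
    """Distance from each sample to its nearest SOM node."""
    d = np.linalg.norm(X[:, None, :] - weights[None, :, :], axis=2)
    return d.min(axis=1)

def flag_outliers(errors, k=3.0):
    """Stand-in rule (not the paper's): flag errors above median + k * 1.4826 * MAD."""
    med = np.median(errors)
    mad = 1.4826 * np.median(np.abs(errors - med))
    return errors > med + k * mad
```

A typical call sequence would be `Z = robust_scale(X)`, `W, grid = train_som(Z)`, then `flag_outliers(quantization_errors(Z, W))`. Whether a given outlier is flagged depends on the data and the SOM hyperparameters, since a node can drift toward a tight cluster of outliers; the paper's 2-dimensional ANLM view is intended to make such cases visible.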