Outlier Detection Based on Weighted Neighbourhood Information Network for Mixed-Valued Datasets.

Yu Wang, Yupeng Li
DOI: https://doi.org/10.1016/j.ins.2021.02.045
IF: 8.1
2021-01-01
Information Sciences
Abstract: Outlier detection is of great importance in industry, where unexpected errors or faults and abnormal behaviours or phenomena can arise from a variety of human, system, and environmental causes. Identifying and analysing these rare items, events, or observations reveals anomalies or novelties and can therefore help avoid unexpected consequences or improve industrial system performance. Unlike the data considered in previous studies, the operating data collected from industrial systems in the Industry 4.0 era are multi-attribute (e.g., both numerical and categorical). This paper therefore proposes a new outlier detection method for mixed-valued datasets based on a weighted network model. Concretely, a weighted neighbourhood information network (WNIN) is constructed from the neighbourhood relations and similarities among objects to represent a dataset with mixed-valued attributes (DMA). A tailored Markov random walk method is then employed to detect outliers on the constructed network. After the walk reaches equilibrium, an inlier score is defined according to the out-degree of nodes in the WNIN to represent the inlier degree of each object. Experiments on two real datasets and a case study illustrate the effectiveness and adaptability of the proposed method.
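
The abstract does not give the paper's similarity measure, edge weights, or inlier-score formula, so the following is only a minimal sketch of the general idea (a similarity graph over mixed-valued objects, scored by a random walk), not the authors' WNIN method. Everything in it is an assumption made for illustration: the Gower-style similarity, the k-nearest-neighbour edges, the damping factor, and the use of the stationary visiting probability as a stand-in for the paper's out-degree-based inlier score. The function names are hypothetical.

```python
import numpy as np


def mixed_similarity(num, cat):
    """Gower-style similarity for mixed data (an assumption, not the paper's measure).

    num: (n, p) float array of numerical attributes.
    cat: (n, q) array of categorical attributes.
    """
    lo = num.min(axis=0)
    span = num.max(axis=0) - lo
    span[span == 0] = 1.0                      # avoid division by zero on constant columns
    scaled = (num - lo) / span
    # Per-attribute similarity: 1 - normalised distance for numbers, match/mismatch for categories.
    num_sim = 1.0 - np.abs(scaled[:, None, :] - scaled[None, :, :])
    cat_sim = (cat[:, None, :] == cat[None, :, :]).astype(float)
    return np.concatenate([num_sim, cat_sim], axis=2).mean(axis=2)


def knn_network(sim, k):
    """Directed weighted graph: each node links to its k most similar other nodes."""
    n = sim.shape[0]
    w = np.zeros_like(sim)
    for i in range(n):
        s = sim[i].copy()
        s[i] = -np.inf                         # exclude self-loops
        nbrs = np.argsort(s)[-k:]
        w[i, nbrs] = sim[i, nbrs]
    return w


def random_walk_scores(w, damping=0.9, tol=1e-10, max_iter=1000):
    """Stationary visiting probabilities of a damped random walk on the graph.

    Low values mark rarely visited nodes, i.e. outlier candidates.
    """
    n = w.shape[0]
    row_sums = w.sum(axis=1, keepdims=True)
    safe = np.where(row_sums > 0, row_sums, 1.0)
    # Row-normalise weights; rows without outgoing weight jump uniformly.
    p = np.where(row_sums > 0, w / safe, 1.0 / n)
    pi = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        nxt = damping * (pi @ p) + (1.0 - damping) / n
        converged = np.abs(nxt - pi).sum() < tol
        pi = nxt
        if converged:
            break
    return pi
```

A usage example on made-up toy data (five similar objects and one that differs in both attribute types):

```python
num = np.array([[1.0], [1.1], [0.9], [1.05], [0.95], [10.0]])
cat = np.array([["a"], ["a"], ["a"], ["a"], ["a"], ["b"]])
scores = random_walk_scores(knn_network(mixed_similarity(num, cat), k=2))
outlier_index = int(np.argmin(scores))  # object with the lowest visiting probability
```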