Isolation Forest Based Anomaly Detection Framework on Non-IID Data

Haolong Xiang, Jiayu Wang, Kotagiri Ramamohanarao, Zoran Salcic, Wanchun Dou, Xuyun Zhang
DOI: https://doi.org/10.1109/MIS.2021.3057914
IF: 6.744
2021-01-01
IEEE Intelligent Systems
Abstract: Anomaly detection is a significant but challenging data mining task in a wide range of applications. Different domains usually use different ways to measure the characteristics of data and to define the anomaly types. As a result, it is a big challenge to develop a versatile anomaly detection framework that can be universally applied with satisfactory performance in most, if not all, applications. In this article, we propose a generic isolation forest based ensemble framework named EDBHiForest, which can be universally applied to data spaces with arbitrary distance measures. It is realized through embedding the isolation forest structure with extended distance-based hashing (EDBH), which can significantly enhance the versatility and applicability of isolation forest based anomaly detection. This framework overcomes the limitations of existing isolation forest based methods that can only be applied to datasets with a very limited range of distance measure types. Extensive experiments on various non-independent and identically distributed datasets demonstrate the effectiveness and efficiency of our approach.
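The abstract does not give the details of EDBH, so the following is only an illustrative sketch of the general idea of driving isolation-tree splits with a distance-based hash, not the authors' EDBHiForest. It assumes the classic pivot-pair line projection from distance-based hashing and the standard isolation forest scoring formula; the names `dbh_projection`, `build_tree`, `build_forest`, and `anomaly_score` are hypothetical. The only thing the code needs from the data space is a distance function `dist(a, b)`.

```python
import math
import random

def dbh_projection(x, p1, p2, dist):
    # Pseudo-projection of x onto the "line" through pivots p1 and p2,
    # computed from pairwise distances only (no coordinates needed).
    d12 = dist(p1, p2)
    return (dist(x, p1) ** 2 + d12 ** 2 - dist(x, p2) ** 2) / (2 * d12)

def build_tree(points, dist, rng, depth, max_depth):
    leaf = {"size": len(points)}
    if len(points) <= 1 or depth >= max_depth:
        return leaf
    # Pick two pivots at nonzero distance; give up after a few tries.
    for _ in range(10):
        p1, p2 = rng.sample(points, 2)
        if dist(p1, p2) > 0:
            break
    else:
        return leaf
    proj = [dbh_projection(x, p1, p2, dist) for x in points]
    lo, hi = min(proj), max(proj)
    if lo == hi:
        return leaf
    t = rng.uniform(lo, hi)  # random split threshold on the hash value
    left = [x for x, v in zip(points, proj) if v < t]
    right = [x for x, v in zip(points, proj) if v >= t]
    if not left or not right:
        return leaf
    return {
        "p1": p1, "p2": p2, "t": t,
        "left": build_tree(left, dist, rng, depth + 1, max_depth),
        "right": build_tree(right, dist, rng, depth + 1, max_depth),
    }

def c(n):
    # Average path length of an unsuccessful binary-search-tree lookup,
    # the normaliser used by the original isolation forest.
    if n <= 1:
        return 0.0
    harmonic = math.log(n - 1) + 0.5772156649 if n > 2 else 1.0
    return 2 * harmonic - 2 * (n - 1) / n

def path_length(x, node, dist, depth=0):
    if "size" in node:  # leaf: add the expected depth of the unbuilt subtree
        return depth + c(node["size"])
    v = dbh_projection(x, node["p1"], node["p2"], dist)
    child = node["left"] if v < node["t"] else node["right"]
    return path_length(x, child, dist, depth + 1)

def build_forest(data, dist, n_trees=100, sample_size=64, seed=0):
    rng = random.Random(seed)
    psi = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(psi))
    trees = [build_tree(rng.sample(data, psi), dist, rng, 0, max_depth)
             for _ in range(n_trees)]
    return trees, psi

def anomaly_score(x, forest, dist):
    # Scores lie in (0, 1]; values closer to 1 suggest an anomaly.
    trees, psi = forest
    mean_path = sum(path_length(x, t, dist) for t in trees) / len(trees)
    return 2 ** (-mean_path / c(psi))
```

Calling `build_forest(data, dist)` and then `anomaly_score(x, forest, dist)` with any distance function (for example Manhattan distance on tuples, or an edit distance on strings) gives a score per object. Points that a random threshold on the hash value separates early get short paths and therefore higher scores.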