A Fast Outlier Detection Method for Big Data.

Boyuan Liu, Wenhui Fan, Tianyuan Xiao
DOI: https://doi.org/10.1007/978-3-642-45037-2_38
2013-01-01
Abstract: Outliers in simulation can help reveal defects in a simulation system. As data scale expands rapidly, conventional outlier detection methods struggle with large datasets. In this paper, we propose an Entropy-based Fast Detection (EFD) algorithm that incorporates new ideas for handling big data. The algorithm takes the information entropy measure as its core, with attribute frequency values as an auxiliary measure. By rapidly computing the decrease in entropy, outliers can be identified quickly. The results show that the EFD algorithm detects outliers with high efficiency and without obvious loss of accuracy.
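The abstract does not give the EFD procedure itself, so the following is only a minimal sketch of the general idea of scoring records by entropy decrease, not the authors' algorithm. It assumes categorical records of equal length, sums per-attribute Shannon entropy, and scores each record by how much that total drops when the record is removed. The function names `entropy_decrease_scores` and `top_k_outliers` are hypothetical, and the auxiliary attribute-frequency step mentioned in the abstract is not modeled. Per-attribute value counts are computed once, so each score costs O(number of attributes) rather than a full recomputation.

```python
import math
from collections import Counter


def _xlog2x(c):
    # c * log2(c), with the convention 0 * log2(0) = 0.
    return c * math.log2(c) if c > 0 else 0.0


def entropy_decrease_scores(records):
    """Score each record by the drop in summed attribute entropy on its removal.

    Assumption (not from the paper): records are equal-length tuples of
    categorical values, and attributes are treated independently.
    """
    n = len(records)
    if n < 2:
        raise ValueError("need at least two records")
    m = len(records[0])

    # One pass per attribute: value counts and S_j = sum(c * log2(c)).
    counts = [Counter(r[j] for r in records) for j in range(m)]
    s = [sum(_xlog2x(c) for c in counts[j].values()) for j in range(m)]
    # Entropy from counts: H_j = log2(n) - S_j / n.
    h = [math.log2(n) - s[j] / n for j in range(m)]

    scores = []
    for r in records:
        total = 0.0
        for j in range(m):
            c = counts[j][r[j]]
            # Removing the record lowers one count from c to c - 1.
            s_new = s[j] - _xlog2x(c) + _xlog2x(c - 1)
            h_new = math.log2(n - 1) - s_new / (n - 1)
            total += h[j] - h_new
        scores.append(total)
    return scores


def top_k_outliers(records, k):
    # Indices of the k records whose removal decreases entropy the most.
    scores = entropy_decrease_scores(records)
    order = sorted(range(len(records)), key=lambda i: scores[i], reverse=True)
    return order[:k]
```

For example, `top_k_outliers([("a", "x")] * 9 + [("b", "y")], 1)` selects the single rare record, because removing it makes both attributes constant and so removes all of their entropy.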