Anomaly Detection for Time Series Data Stream

Qifan Wang, Bo Yan, Hongyi Su, Hong Zheng
DOI: https://doi.org/10.1109/icbda51983.2021.9402957
2021-01-01
Abstract: Time series are an important class of data, characterized by high dimensionality, large volume, and rapid updates. In anomaly detection, the data are skewed and abnormal samples are scarce, which makes it difficult to train traditional supervised learning models. At the same time, with the rise of the Internet of Things, more and more data arrives in the form of streams. To address these problems, this paper proposes an anomaly detection method for time series data streams. The method first applies multiple random convolution kernels to transform the time series into features, then feeds the resulting feature map into an RRCF (Robust Random Cut Forest), and finally scores each sample according to the characteristics of the RRCF; samples whose scores exceed a threshold are considered abnormal. Because the method maintains its model dynamically rather than relying on a pre-trained model, it can detect anomalies in time series data streams in real time, requires no manual labels, and has low cost. Experimental results show that the method performs well on different data sets. Finally, the algorithm is implemented on the Apache Flink platform, which greatly improves the throughput of the detection system and enables it to process massive data.
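
The pipeline described in the abstract (random convolution kernels, then a streaming scorer, then a threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the kernel lengths, dilations, window size, threshold, and all function and class names below are assumptions chosen for illustration, and the scorer is a deliberately simple stand-in (mean distance to recently seen feature vectors) in place of the RRCF, which the abstract names but does not specify in detail.

```python
import random
from collections import deque


def make_kernels(num_kernels, seed=0):
    """Draw random 1-D kernels as (weights, bias, dilation) tuples.

    Lengths, dilations and distributions are illustrative assumptions.
    """
    rng = random.Random(seed)
    kernels = []
    for _ in range(num_kernels):
        length = rng.choice([3, 5, 7])
        weights = [rng.gauss(0.0, 1.0) for _ in range(length)]
        mean = sum(weights) / length
        weights = [w - mean for w in weights]  # zero-centre the weights
        bias = rng.uniform(-1.0, 1.0)
        dilation = rng.choice([1, 2])
        kernels.append((weights, bias, dilation))
    return kernels


def transform(window, kernels):
    """Map one window to two features per kernel: max response and
    proportion of positive values (PPV)."""
    features = []
    for weights, bias, dilation in kernels:
        span = (len(weights) - 1) * dilation
        outputs = [
            bias + sum(w * window[i + j * dilation] for j, w in enumerate(weights))
            for i in range(len(window) - span)
        ]
        features.append(max(outputs))
        features.append(sum(o > 0 for o in outputs) / len(outputs))
    return features


class StreamScorer:
    """Stand-in scorer (NOT an RRCF): mean Euclidean distance between the
    new feature vector and a bounded buffer of recent ones."""

    def __init__(self, history=64):
        self.recent = deque(maxlen=history)

    def score(self, features):
        if self.recent:
            dists = [
                sum((a - b) ** 2 for a, b in zip(features, past)) ** 0.5
                for past in self.recent
            ]
            value = sum(dists) / len(dists)
        else:
            value = 0.0
        self.recent.append(features)  # the model is maintained as data arrives
        return value


def detect(stream, window_size=16, num_kernels=8, threshold=3.0):
    """Return the time indices whose window score exceeds the threshold."""
    kernels = make_kernels(num_kernels)
    scorer = StreamScorer()
    window = deque(maxlen=window_size)
    flagged = []
    for t, x in enumerate(stream):
        window.append(x)
        if len(window) < window_size:
            continue  # wait until the first full window is available
        if scorer.score(transform(list(window), kernels)) > threshold:
            flagged.append(t)
    return flagged
```

Calling `detect(values)` on an iterable of floats returns the indices flagged as abnormal. No labels or pre-training step are involved: the scorer's buffer is updated with every window, which mirrors the "dynamic maintenance" idea in the abstract. A real RRCF would replace `StreamScorer` and score points by how much they change the forest's structure rather than by raw distance.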