Cleanits-MEDetect: Multiple Errors Detection for Time Series Data in Cleanits.

Xiaoou Ding, Yichen Song, Hongzhi Wang, Donghua Yang, Yida Liu
DOI: https://doi.org/10.1007/978-3-031-30678-5_54
2023-01-01
Abstract: Data quality problems are prevalent in time series data, which suffer from several types of errors, including single-point errors, continuous errors, and contextual errors. Because achieving both high accuracy and high efficiency in error detection for time series data is challenging, we develop MEDetect, an error detection system in Cleanits, a data cleaning tool for multi-dimensional industrial time series. We propose an integrated detection model for multiple errors that uses a hierarchical variational autoencoder as its main structure, and a dimensionality reduction method for the k-shape based clustering algorithm that reduces the time cost of the detection process. MEDetect is designed to allow customized error detection: users can choose detection and repair algorithms according to their needs.