An End-to-End Model for Time Series Classification In the Presence of Missing Values

Pengshuai Yao, Mengna Liu, Xu Cheng, Fan Shi, Huan Li, Xiufeng Liu, Shengyong Chen
2024-08-12
Abstract: Time series classification with missing data is a prevalent issue in time series analysis, as temporal data often contain missing values in practical applications. The traditional two-stage approach, which handles imputation and classification separately, can result in sub-optimal performance as label information is not utilized in the imputation process. On the other hand, a one-stage approach can learn features under missing information, but feature representation is limited as imputed errors are propagated in the classification process. To overcome these challenges, this study proposes an end-to-end neural network that unifies data imputation and representation learning within a single framework, allowing the imputation process to take advantage of label information. Differing from previous methods, our approach places less emphasis on the accuracy of imputation data and instead prioritizes classification performance. A specifically designed multi-scale feature learning module is implemented to extract useful information from the noise-imputation data. The proposed model is evaluated on 68 univariate time series datasets from the UCR archive, as well as a multivariate time series dataset with various missing data ratios and 4 real-world datasets with missing information. The results indicate that the proposed model outperforms state-of-the-art approaches for incomplete time series classification, particularly in scenarios with high levels of missing data.
Machine Learning
What problem does this paper attempt to address?
This paper addresses time series classification when the data contain missing values (Incomplete Time Series Classification, ITSC). Specifically:

1. **Limitations of existing methods**:
   - **Two-stage methods**: Traditional two-stage methods perform data imputation and classification separately. Because label information is not used during imputation, classification performance can be suboptimal.
   - **One-stage methods**: One-stage methods learn features directly under missing information, but their feature representation is limited because imputation errors propagate into the classification process.
2. **Proposed method**:
   - The paper proposes an end-to-end neural network that unifies data imputation and representation learning within a single framework, allowing the imputation process to leverage label information.
   - Unlike previous methods, the approach places less emphasis on the accuracy of the imputed data and prioritizes classification performance.
   - A multi-scale feature learning module is designed to extract useful information from the noisy imputed data.
3. **Experimental validation**:
   - The model was evaluated on 68 univariate time series datasets from the UCR archive, on a multivariate time series dataset with various missing data ratios, and on 4 real-world datasets with missing information.
   - Results indicate that the model outperforms state-of-the-art approaches for incomplete time series classification, particularly in scenarios with high levels of missing data.

In this way, the paper aims to improve time series classification performance when a substantial share of the values is missing.
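
To make the idea of label-aware imputation concrete, the sketch below shows one generic way a single objective can combine a classification term with a down-weighted reconstruction term. It is an illustration under stated assumptions, not the paper's architecture or loss: the summary does not specify either, and the names `mask_random`, `joint_loss`, and the weight `lam` are hypothetical. Missing values are simulated completely at random, which may differ from the missingness patterns used in the paper's experiments.

```python
import numpy as np

def mask_random(x, missing_ratio, rng):
    """Hide roughly `missing_ratio` of the entries completely at random.

    Returns the observed series (missing entries set to 0) and a boolean
    mask that is True where a value is observed.
    """
    mask = rng.random(x.shape) >= missing_ratio
    return np.where(mask, x, 0.0), mask

def joint_loss(x_observed, x_imputed, mask, logits, labels, lam=0.1):
    """Cross-entropy plus `lam` times reconstruction error (hypothetical form).

    Reconstruction is scored only on observed entries, since the true
    values at missing positions are unknown in practice. A small `lam`
    is one way to express "less emphasis on imputation accuracy".
    """
    sq_err = (x_imputed - x_observed) ** 2
    recon = (sq_err * mask).sum() / max(mask.sum(), 1)

    # Numerically stable softmax cross-entropy.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    ce = -log_probs[np.arange(len(labels)), labels].mean()

    return ce + lam * recon, ce, recon

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 50))            # 8 toy univariate series of length 50
    x_obs, mask = mask_random(x, 0.4, rng)  # hide about 40% of the values
    x_hat = np.where(mask, x_obs, 0.0)      # placeholder "imputation": zeros
    logits = rng.normal(size=(8, 3))        # stand-in classifier outputs, 3 classes
    labels = rng.integers(0, 3, size=8)
    total, ce, recon = joint_loss(x_obs, x_hat, mask, logits, labels)
    print(f"total={total:.4f} ce={ce:.4f} recon={recon:.4f}")
```

In an end-to-end model, both terms would be backpropagated through a shared network so that the gradient of the classification term also shapes the imputed values; a two-stage pipeline, by contrast, would optimize the reconstruction term alone before any labels are seen.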