Abstract: The variety of complex algorithmic approaches for tackling time-series classification problems has grown considerably over the past decades, including the development of sophisticated but challenging-to-interpret deep-learning-based methods. But without comparison to simpler methods it can be difficult to determine when such complexity is required to obtain strong performance on a given problem. Here we evaluate the performance of an extremely simple classification approach -- a linear classifier in the space of two simple features that ignore the sequential ordering of the data: the mean and standard deviation of time-series values. Across a large repository of 128 univariate time-series classification problems, this simple distributional moment-based approach outperformed chance on 69 problems, and reached 100% accuracy on two problems. With a neuroimaging time-series case study, we find that a simple linear model based on the mean and standard deviation performs better at classifying individuals with schizophrenia than a model that additionally includes features of the time-series dynamics. Comparing the performance of simple distributional features of a time series provides important context for interpreting the performance of complex time-series classification models, which may not always be required to obtain high accuracy.
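The approach described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic Gaussian noise, the helper names (`moment_features`, `fit_centroids`, `predict`, `make_dataset`) are hypothetical, and a nearest-centroid rule stands in for the paper's linear classifier, which the abstract does not specify. For two classes, a nearest-centroid rule yields a linear decision boundary in the mean and standard deviation feature space.

```python
import random
import statistics


def moment_features(series):
    """Return (mean, standard deviation); the ordering of values is ignored."""
    return (statistics.mean(series), statistics.pstdev(series))


def fit_centroids(features, labels):
    """Compute the per-class average feature vector."""
    grouped = {}
    for row, label in zip(features, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: tuple(statistics.mean(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, row):
    """Pick the class whose centroid is nearest in squared Euclidean distance."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(centroids, key=squared_distance)


def make_dataset(rng, n_per_class, length=200):
    """Synthetic data (an assumption): two noise classes that differ only in spread."""
    data = []
    for label, spread in (("narrow", 1.0), ("wide", 3.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(0.0, spread) for _ in range(length)], label))
    return data


rng = random.Random(0)
train = make_dataset(rng, 20)
test = make_dataset(rng, 20)

# Fit on the two-dimensional feature space only; raw series are never compared.
centroids = fit_centroids(
    [moment_features(series) for series, _ in train],
    [label for _, label in train],
)
hits = sum(predict(centroids, moment_features(series)) == label for series, label in test)
accuracy = hits / len(test)
print(f"test accuracy: {accuracy:.2f}")
```

Because both features depend only on the distribution of values, shuffling the time points of any series leaves its feature vector, and therefore its predicted class, unchanged.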

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper examines time-series classification and addresses the following problems:

1. **Effectiveness of Simple Methods**: A linear classifier that uses only the mean and standard deviation of each time series outperformed chance on 69 of 128 univariate benchmark problems and reached 100% accuracy on two of them. This suggests that complex methods are not always required to obtain high accuracy.
2. **Importance of Benchmarking**: The paper emphasizes using simple baseline methods when evaluating time-series classification algorithms. Comparing against simple distributional features shows how much of a complex model's performance actually depends on its added complexity.
3. **Performance on a Specific Dataset**: The paper analyzes a neuroimaging case study: distinguishing individuals with schizophrenia from healthy controls using resting-state functional magnetic resonance imaging (rs-fMRI) data. A simple linear model using only the mean and standard deviation as features performs better than a model that additionally includes features of the time-series dynamics.
4. **Role of Normalization**: The paper also discusses the effect of time-series normalization (such as the z-score transformation) on classification results. If every time series is z-scored, each one has mean 0 and standard deviation 1, so a classifier based on these two features has nothing left to discriminate on. Whether and how the data were normalized must therefore be taken into account when comparing methods.

Overall, the paper argues that the performance of simple distributional features, model interpretability, and the choice of features (including the risk of overfitting with larger feature sets) should all be considered carefully when developing and interpreting time-series classification models.
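The normalization point above can be checked directly. The following is a minimal sketch on synthetic data (the two series and the helper `z_score` are assumptions for illustration, not taken from the paper): two series with very different means and spreads become indistinguishable in the mean and standard deviation feature space once each is z-scored.

```python
import random
import statistics


def z_score(series):
    """Rescale a series to mean 0 and (population) standard deviation 1."""
    mu = statistics.mean(series)
    sigma = statistics.pstdev(series)
    return [(x - mu) / sigma for x in series]


rng = random.Random(1)
# Hypothetical example series with clearly different first two moments.
quiet = [rng.gauss(5.0, 0.5) for _ in range(100)]
loud = [rng.gauss(-2.0, 4.0) for _ in range(100)]

for name, series in (("quiet", quiet), ("loud", loud)):
    normalized = z_score(series)
    raw = (statistics.mean(series), statistics.pstdev(series))
    after = (statistics.mean(normalized), statistics.pstdev(normalized))
    print(f"{name}: raw mean/sd = {raw[0]:.2f}/{raw[1]:.2f}, "
          f"z-scored mean/sd = {abs(after[0]):.2f}/{after[1]:.2f}")
```

After z-scoring, both series map to the same point in the feature space (up to floating-point error), so any information the raw mean and standard deviation carried about class membership is gone.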

Never a Dull Moment: Distributional Properties as a Baseline for Time-Series Classification
