Abstract: Time Series Classification (TSC) is essential in fields like medicine, environmental science, and finance, enabling tasks such as disease diagnosis, anomaly detection, and stock price analysis. While machine learning models like Recurrent Neural Networks and InceptionTime are successful in numerous applications, they can face scalability issues due to computational requirements. Recently, ROCKET has emerged as an efficient alternative, achieving state-of-the-art performance and simplifying training by utilizing a large number of randomly generated features from the time series data. However, many of these features are redundant or non-informative, increasing computational load and compromising generalization. Here we introduce Sequential Feature Detachment (SFD) to identify and prune non-essential features in ROCKET-based models, such as ROCKET, MiniRocket, and MultiRocket. SFD estimates feature importance using model coefficients and can handle large feature sets without complex hyperparameter tuning. Testing on the UCR archive shows that SFD can produce models with better test accuracy using only 10% of the original features. We named these pruned models Detach-ROCKET. We also present an end-to-end procedure for determining an optimal balance between the number of features and model accuracy. On the largest binary UCR dataset, Detach-ROCKET improves test accuracy by 0.6% while reducing features by 98.9%. By enabling a significant reduction in model size without sacrificing accuracy, our methodology improves computational efficiency and contributes to model interpretability. We believe that Detach-ROCKET will be a valuable tool for researchers and practitioners working with time series data, who can find a user-friendly implementation of the model at https://github.com/gon-uri/detach_rocket.

What problem does this paper attempt to address?

The paper addresses the computational cost and weakened generalization caused by feature redundancy in Time Series Classification (TSC). Models such as Recurrent Neural Networks (RNNs) and InceptionTime succeed in many applications but can face scalability issues because of their computational requirements. ROCKET is a more efficient alternative that relies on a large number of randomly generated features, yet many of those features are redundant or non-informative. To tackle this, the paper introduces Sequential Feature Detachment (SFD), which identifies and prunes non-essential features in ROCKET-based models (ROCKET, MiniRocket, and MultiRocket). SFD estimates feature importance from the coefficients of the trained model and can handle large feature sets without complex hyperparameter tuning. Experiments on the UCR archive show that SFD can produce models with better test accuracy while retaining only 10% of the original features. On the largest binary UCR dataset, the pruned model (Detach-ROCKET) improves test accuracy by 0.6% while reducing the number of features by 98.9%. The reduction in model size improves computational efficiency and contributes to model interpretability. The paper also presents an end-to-end procedure for determining an optimal balance between the number of features and model accuracy.
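
The pruning idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a closed-form ridge regression on labels coded as -1/+1 as a stand-in for the linear classifier, features that are already standardized so coefficient magnitudes are comparable, and a hypothetical trade-off rule (validation accuracy minus a penalty proportional to the fraction of features kept). The names `fit_ridge`, `sequential_feature_detachment`, `select_tradeoff`, `drop_fraction`, and `penalty` are all illustrative.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression on centered data; returns (coefficients, intercept)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    w = np.linalg.solve(gram, Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w

def sequential_feature_detachment(X, y, drop_fraction=0.05, min_features=1, alpha=1.0):
    """Refit, then drop the features with the smallest |coefficient|, and repeat.

    Returns a list of (active_feature_indices, coefficients, intercept),
    one entry per pruning step, starting with the full feature set.
    """
    active = np.arange(X.shape[1])
    history = []
    while True:
        w, b = fit_ridge(X[:, active], y, alpha)
        history.append((active.copy(), w, b))
        if active.size <= min_features:
            break
        n_drop = max(1, int(drop_fraction * active.size))
        n_keep = max(min_features, active.size - n_drop)
        # Keep the n_keep features whose coefficients have the largest magnitude.
        keep = np.argsort(np.abs(w))[::-1][:n_keep]
        active = np.sort(active[keep])
    return history

def select_tradeoff(history, X_val, y_val, penalty=0.1):
    """Pick a pruning step with a hypothetical rule (assumption, not the paper's criterion):
    validation accuracy minus penalty * fraction of features kept."""
    total = history[0][0].size
    accuracies, scores = [], []
    for active, w, b in history:
        pred = np.where(X_val[:, active] @ w + b >= 0, 1, -1)
        acc = float(np.mean(pred == y_val))
        accuracies.append(acc)
        scores.append(acc - penalty * active.size / total)
    return int(np.argmax(scores)), accuracies

# Synthetic usage example: only features 0 and 1 carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 50))
y = np.where(2 * X[:, 0] + 2 * X[:, 1] + 0.1 * rng.standard_normal(400) >= 0, 1, -1)
history = sequential_feature_detachment(X[:200], y[:200], drop_fraction=0.5, min_features=2)
best, accuracies = select_tradeoff(history, X[200:], y[200:])
print("features kept at chosen step:", history[best][0])
```

In practice the feature matrix would come from a ROCKET-style transform rather than random numbers, and the linear model would typically be a regularized classifier with its regularization strength chosen by cross-validation; both are left out here to keep the sketch dependency-free apart from NumPy.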

Detach-ROCKET: Sequential feature selection for time series classification with random convolutional kernels

Classification of Raw MEG/EEG Data with Detach-Rocket Ensemble: An Improved ROCKET Algorithm for Multivariate Time Series Analysis

POCKET: Pruning Random Convolution Kernels for Time Series Classification from a Feature Selection Perspective

MultiRocket: multiple pooling operators and transformations for fast and effective time series classification

MiniRocket: A Very Fast (Almost) Deterministic Transform for Time Series Classification

Time series classification with random convolution kernels based transforms: pooling operators and input representations matter

ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels

HDC-MiniROCKET: Explicit Time Encoding in Time Series Classification with Hyperdimensional Computing

Integrating Data-Driven Segmentation, Local Feature Extraction and Fisher Kernel Encoding to Improve Time Series Classification

Prognostic classification based on random convolutional kernel

The great multivariate time series classification bake off: a review and experimental evaluation of recent algorithmic advances

Neural fingerprinting on MEG time series using MiniRocket

Fast, Accurate and Interpretable Time Series Classification Through Randomization

ECRTime: Ensemble Integration of Classification and Retrieval for Time Series Classification

Fast, accurate and explainable time series classification through randomization

Enhanced and Efficient Atrial Fibrillation Detection Using Minirocket

TSec: an Efficient and Effective Framework for Time Series Classification

Automatic Feature Engineering for Time Series Classification: Evaluation and Discussion

Data Augmentation for Multivariate Time Series Classification: An Experimental Study

Random Convolution Kernels with Multi-Scale Decomposition for Preterm EEG Inter-burst Detection