Self-Bidirectional Decoupled Distillation for Time Series Classification

Zhiwen Xiao, Huanlai Xing, Rong Qu, Hui Li, Li Feng, Bowen Zhao, Jiayi Yang
DOI: https://doi.org/10.1109/tai.2024.3360180
2024-01-01
Abstract: Many deep learning algorithms have been developed for time series classification (TSC). A model's performance usually depends on the quality of the semantic information extracted at the lower and higher levels of its representation hierarchy, so promoting efficient mutual learning between these levels during training is vital. To this end, we propose a self-bidirectional decoupled distillation (Self-BiDecKD) method for TSC. Most self-distillation algorithms transfer only target-class knowledge from higher to lower levels. In contrast, Self-BiDecKD pairs the output of the output layer with the output of each lower-level block to form a bidirectional decoupled knowledge distillation (KD) pair. The bidirectional decoupled KD promotes mutual learning between lower- and higher-level semantic information and extracts the knowledge hidden in both the target and non-target classes, helping Self-BiDecKD capture rich representations from the data. Experimental results show that, compared with a number of self-distillation algorithms, Self-BiDecKD wins on 35 of the 85 UCR2018 datasets and achieves the lowest AVG_rank score, 3.2882. In particular, compared with a non-self-distillation Baseline, Self-BiDecKD records 58 wins, 8 ties, and 19 losses.
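The abstract does not give the loss in closed form, so the following is only a minimal NumPy sketch of what a bidirectional decoupled KD pair could look like. It assumes the commonly used decoupling of KD into a target-class term and a non-target-class term, each a KL divergence. The function names (`dkd_terms`, `decoupled_kd`, `bidirectional_dkd`), the weights `alpha` and `beta`, and the temperature `T` are hypothetical and are not taken from the paper. The sketch only evaluates loss values; it does not model gradients or training.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def kl(p, q, eps=1e-12):
    # Row-wise KL(p || q) for batches of probability vectors.
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def dkd_terms(student_logits, teacher_logits, labels, T=1.0):
    # Returns per-sample (target-class KD, non-target-class KD).
    p_s = softmax(student_logits / T)
    p_t = softmax(teacher_logits / T)
    n, c = p_s.shape
    rows = np.arange(n)
    pt_s, pt_t = p_s[rows, labels], p_t[rows, labels]

    # Target-class term: KL between binary (target vs. rest) distributions.
    b_s = np.stack([pt_s, 1.0 - pt_s], axis=-1)
    b_t = np.stack([pt_t, 1.0 - pt_t], axis=-1)
    tckd = kl(b_t, b_s)

    # Non-target term: KL between distributions renormalised over
    # the non-target classes only.
    mask = np.ones((n, c), dtype=bool)
    mask[rows, labels] = False
    nt_s = p_s[mask].reshape(n, c - 1) / (1.0 - pt_s)[:, None]
    nt_t = p_t[mask].reshape(n, c - 1) / (1.0 - pt_t)[:, None]
    nckd = kl(nt_t, nt_s)
    return tckd, nckd

def decoupled_kd(student_logits, teacher_logits, labels,
                 alpha=1.0, beta=1.0, T=1.0):
    # Assumed weighting: alpha * target term + beta * non-target term.
    tckd, nckd = dkd_terms(student_logits, teacher_logits, labels, T)
    return float(np.mean(alpha * tckd + beta * nckd) * T ** 2)

def bidirectional_dkd(block_logits, output_logits, labels, **kw):
    # Assumed pairing: the output layer teaches the lower-level block,
    # and the lower-level block teaches the output layer.
    down = decoupled_kd(block_logits, output_logits, labels, **kw)
    up = decoupled_kd(output_logits, block_logits, labels, **kw)
    return down + up
```

Under these assumptions, a model with several lower-level blocks would sum `bidirectional_dkd` over the blocks and add the result to the usual classification loss. How the paper actually weights and combines the terms is not stated in the abstract.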