Model Selection with a Shapelet-based Distance Measure for Multi-source Transfer Learning in Time Series Classification

Jiseok Lee, Brian Kenji Iwana
2024-09-30
Abstract: Transfer learning is a common practice that alleviates the need for extensive data to train neural networks. It is performed by pre-training a model using a source dataset and fine-tuning it for a target task. However, not every source dataset is appropriate for each target dataset, especially for time series. In this paper, we propose a novel method of selecting and using multiple datasets for transfer learning for time series classification. Specifically, our method combines multiple datasets as one source dataset for pre-training neural networks. Furthermore, for selecting multiple sources, our method measures the transferability of datasets based on shapelet discovery for effective source selection. While traditional transferability measures require considerable time for pre-training all the possible sources for source selection of each possible architecture, our method can be repeatedly used for every possible architecture with a single simple computation. Using the proposed method, we demonstrate that it is possible to increase the performance of temporal convolutional neural networks (CNNs) on time series datasets.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses how to effectively select multiple source datasets for transfer learning in time-series classification. Pre-training a neural network on source data reduces the amount of labeled target data required and can improve model performance, but not every source dataset suits every target dataset, so choosing appropriate sources is crucial to the effectiveness of transfer learning. The paper proposes a shapelet-based distance measure for selecting and using multiple datasets: it estimates the transferability between datasets through shapelet discovery. Unlike traditional transferability measures, it can select appropriate source datasets without pre-training on every possible source.

### Main contributions:

1. **A new source-dataset selection method**: The method predicts and selects source datasets using the similarity between shapelets extracted from the target dataset and the possible source datasets.
2. **Multi-source pre-training**: Multiple source datasets are combined into a single super-dataset for pre-training neural networks.
3. **Extensive experimental verification**: The proposed method is evaluated on the 128 time-series datasets of the 2018 UCR Time Series Archive.
4. **Open-source code**: To make the method easy for other researchers to use, the authors provide easy-to-use transfer-learning code.

### Problems solved:

- **Data insufficiency**: Through transfer learning, a large number of source datasets can be used to pre-train the model, reducing the need for labeled data in the target dataset.
- **Source-dataset selection**: A method based on shapelet similarity selects appropriate source datasets, avoiding the performance degradation caused by inappropriate sources.
- **Computational efficiency**: Compared with traditional transferability measures, the method is more computationally efficient and does not require pre-training on all possible source datasets.

Through these contributions, the paper aims to improve the performance and efficiency of transfer learning in time-series classification tasks.
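
The selection and combination steps summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not specify how shapelets are discovered or how the distance is aggregated, so the code assumes the target shapelets are already available, scores each candidate source by the mean of the minimum z-normalized sliding-window distances, and merges the selected sources by linear resampling to a common length with shifted class labels. All function names (`shapelet_to_series_distance`, `dataset_distance`, `select_sources`, `combine_sources`) are hypothetical.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero for constant windows


def znorm(x):
    """Z-normalize a 1-D sequence."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / (x.std() + EPS)


def shapelet_to_series_distance(shapelet, series):
    """Smallest z-normalized RMS distance between the shapelet and any
    window of the series. Assumes len(series) >= len(shapelet)."""
    s = znorm(shapelet)
    m = len(s)
    windows = np.lib.stride_tricks.sliding_window_view(
        np.asarray(series, dtype=float), m
    )
    mu = windows.mean(axis=1, keepdims=True)
    sd = windows.std(axis=1, keepdims=True) + EPS
    wz = (windows - mu) / sd
    return float(np.sqrt(((wz - s) ** 2).sum(axis=1) / m).min())


def dataset_distance(shapelets, X_source):
    """Assumed aggregation: mean distance over all (shapelet, series) pairs."""
    return float(np.mean([
        shapelet_to_series_distance(s, x) for s in shapelets for x in X_source
    ]))


def select_sources(shapelets, candidates, k):
    """Return the names of the k candidate sources closest to the target shapelets.
    `candidates` maps a dataset name to a tuple (X, y)."""
    scores = {name: dataset_distance(shapelets, X)
              for name, (X, _) in candidates.items()}
    return sorted(scores, key=scores.get)[:k]


def resample(x, length):
    """Linearly interpolate a series to a fixed length."""
    x = np.asarray(x, dtype=float)
    return np.interp(np.linspace(0, 1, length), np.linspace(0, 1, len(x)), x)


def combine_sources(candidates, names, length):
    """Merge the selected sources into one dataset. Labels are re-indexed and
    offset so classes from different sources never collide."""
    Xs, ys, offset = [], [], 0
    for name in names:
        X, y = candidates[name]
        Xs.append(np.stack([resample(x, length) for x in X]))
        classes, y_idx = np.unique(y, return_inverse=True)
        ys.append(y_idx + offset)
        offset += len(classes)
    return np.concatenate(Xs), np.concatenate(ys)


# Toy usage: one target shapelet and two made-up candidate sources.
shapelet = np.sin(np.linspace(0, 2 * np.pi, 20))
candidates = {
    "bumps": ([np.concatenate([np.zeros(10), shapelet, np.zeros(10)]),
               np.concatenate([np.zeros(5), shapelet, np.zeros(15)])], [0, 1]),
    "ramps": ([np.linspace(0, 1, 40), np.linspace(1, 0, 40)], ["up", "down"]),
}
selected = select_sources([shapelet], candidates, k=1)
X_src, y_src = combine_sources(candidates, selected, length=32)
```

Because the score depends only on the data and not on any network, it is computed once and the resulting ranking can be reused for every architecture, which is the efficiency argument the summary makes. The combined `X_src, y_src` would then be used to pre-train a model before fine-tuning on the target dataset.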