CALDA: Improving Multi-Source Time Series Domain Adaptation with Contrastive Adversarial Learning

Garrett Wilson, Janardhan Rao Doppa, Diane J. Cook
2023-07-22
Abstract: Unsupervised domain adaptation (UDA) provides a strategy for improving machine learning performance in data-rich (target) domains where ground truth labels are inaccessible but can be found in related (source) domains. In cases where meta-domain information such as label distributions is available, weak supervision can further boost performance. We propose a novel framework, CALDA, to tackle these two problems. CALDA synergistically combines the principles of contrastive learning and adversarial learning to robustly support multi-source UDA (MS-UDA) for time series data. Similar to prior methods, CALDA utilizes adversarial learning to align source and target feature representations. Unlike prior approaches, CALDA additionally leverages cross-source label information across domains. CALDA pulls examples with the same label close to each other, while pushing apart examples with different labels, reshaping the space through contrastive learning. Unlike prior contrastive adaptation methods, CALDA requires neither data augmentation nor pseudo labeling, which may be more challenging for time series. We empirically validate our proposed approach. Based on results from human activity recognition, electromyography, and synthetic datasets, we find utilizing cross-source information improves performance over prior time series and contrastive methods. Weak supervision further improves performance, even in the presence of noise, allowing CALDA to offer generalizable strategies for MS-UDA. Code is available at: https://github.com/floft/calda
Machine Learning
What problem does this paper attempt to address?
The problem this paper addresses is how to improve machine learning performance on the target domain in multi-source unsupervised domain adaptation (MS-UDA) for time series, where the target domain has no ground-truth labels. Specifically, the authors propose a new framework, CALDA (Contrastive Adversarial Learning for Multi-Source Time Series Domain Adaptation), which combines contrastive learning with adversarial learning and exploits cross-source label information.

### Main Problem Description

1. **Challenges of Unsupervised Domain Adaptation (UDA)**:
   - Without ground-truth labels in the target domain, labeled data from multiple source domains must be used to improve model performance on the target domain.
   - Domain adaptation for multi-source time series data is particularly complex because feature distributions may differ substantially among the source domains.
2. **Limitations of Existing Methods**:
   - Existing UDA methods usually handle only single-source domain adaptation and rarely consider time series data.
   - Some methods rely on data augmentation or pseudo-labels, which may not work well on time series data.
3. **Introducing Weakly Supervised Information**:
   - When meta-domain information (such as label distributions) is available, the question is how to use it to further improve model performance.

### Goals of the CALDA Framework

The main goals of the CALDA framework are:

- **Align Feature Representations of Source and Target Domains**: Use adversarial learning to make the source and target feature representations as consistent as possible.
- **Utilize Cross-Source Label Information**: Use contrastive learning to pull samples with the same label closer together and push samples with different labels apart, making better use of source-domain labels.
- **Avoid Data Augmentation and Pseudo-Labels**: Unlike contrastive learning methods from the image domain, CALDA requires neither data augmentation nor pseudo-labels, which makes it better suited to time series data.
- **Integrate Weakly Supervised Information**: When meta-domain information is available, use it to further improve model performance.

### Specific Contributions

1. **Improved Time Series Multi-Source UDA**: Exploit cross-source label information through contrastive learning, without data augmentation or pseudo-labels.
2. **Multiple Contrastive Learning Strategies**: Analyze how different design choices affect performance.
3. **Utilize Class-Distribution Information**: Incorporate class-distribution information through a weak supervision mechanism.
4. **Performance Verification**: Evaluate CALDA on synthetic datasets and real time series datasets (human activity recognition and electromyography), showing performance improvements both with and without weak supervision.

### Summary

This paper addresses key problems in multi-source time series unsupervised domain adaptation by proposing the CALDA framework, focusing on how to use source-domain labels and meta-domain information effectively when the target domain lacks labels.
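
The cross-source contrastive goal described above (pull same-label examples together, push different-label examples apart) can be illustrated with a small InfoNCE-style loss. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `cross_source_contrastive_loss`, the use of cosine similarity, the `temperature` value, and the rule that positives share the anchor's label but come from a different source domain are all illustrative choices. The adversarial alignment and weak supervision components are not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_source_contrastive_loss(embeddings, labels, domains, temperature=0.1):
    """Hypothetical InfoNCE-style loss over labeled source-domain embeddings.

    Assumption for illustration: for each anchor, positives have the same
    label but a different source domain; negatives have a different label.
    """
    n = len(embeddings)
    losses = []
    for i in range(n):
        positives = [j for j in range(n)
                     if j != i and labels[j] == labels[i] and domains[j] != domains[i]]
        negatives = [j for j in range(n) if labels[j] != labels[i]]
        if not positives or not negatives:
            continue  # anchor has nothing to contrast against
        neg_sum = sum(math.exp(cosine(embeddings[i], embeddings[k]) / temperature)
                      for k in negatives)
        for j in positives:
            pos = math.exp(cosine(embeddings[i], embeddings[j]) / temperature)
            # Loss is low when the positive dominates the negatives.
            losses.append(-math.log(pos / (pos + neg_sum)))
    return sum(losses) / len(losses) if losses else 0.0


# Toy data: two source domains, two classes, 2-D embeddings.
labels = [0, 1, 0, 1]
domains = [0, 0, 1, 1]
clustered = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]  # same label is close
mixed = [[1.0, 0.0], [0.0, 1.0], [0.1, 0.9], [0.9, 0.1]]      # same label is far

clustered_loss = cross_source_contrastive_loss(clustered, labels, domains)
mixed_loss = cross_source_contrastive_loss(mixed, labels, domains)
print(clustered_loss < mixed_loss)
```

In this toy setup the loss is lower when same-label examples from different source domains sit close together in the embedding space, which is the behavior the contrastive objective rewards. Because positives are drawn from labeled source data rather than from augmented views or pseudo-labeled target data, the sketch needs neither augmentation nor pseudo-labels.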