Learning From a Complementary-Label Source Domain: Theory and Algorithms

Yiyang Zhang, Feng Liu, Zhen Fang, Bo Yuan, Guangquan Zhang, Jie Lu
DOI: https://doi.org/10.1109/tnnls.2021.3086093
IF: 14.255
2021-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: In unsupervised domain adaptation (UDA), a classifier for the target domain is trained with massive true-label data from the source domain and unlabeled data from the target domain. However, collecting true-label data in the source domain can be expensive and sometimes impractical. Compared to the true label (TL), a complementary label (CL) specifies a class that a pattern does not belong to, and hence, collecting CLs would be less laborious than collecting TLs. In this article, we propose a novel setting where the source domain is composed of complementary-label data, and a theoretical bound of this setting is provided. We consider two cases of this setting: one is that the source domain only contains complementary-label data [completely complementary UDA (CC-UDA)] and the other is that the source domain has plenty of complementary-label data and a small amount of true-label data [partly complementary UDA (PC-UDA)]. To this end, a complementary label adversarial network (CLARINET) is proposed to solve CC-UDA and PC-UDA problems. CLARINET maintains two deep networks simultaneously, with one focusing on classifying the complementary-label source data and the other taking care of the source-to-target distributional adaptation. Experiments show that CLARINET significantly outperforms a series of competent baselines on handwritten digit-recognition and object-recognition tasks.
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture
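The abstract describes a complementary label as a class that a pattern does not belong to. As a rough illustration only, the sketch below draws complementary labels uniformly from the wrong classes and evaluates a risk estimator of the form "sum of per-class losses minus (K - 1) times the loss on the complementary label", which recovers the ordinary loss in expectation under that uniform assumption. The function names, the uniform sampling, and the choice of cross-entropy are assumptions made for this example; it is not claimed to be the loss used by CLARINET.

```python
import numpy as np


def sample_complementary_labels(true_labels, num_classes, rng):
    """Draw one complementary label per example, uniformly from the K-1 wrong classes.

    Assumption for this sketch: every wrong class is equally likely.
    """
    true_labels = np.asarray(true_labels)
    # An offset in 1..K-1 added modulo K never lands on the true class.
    offsets = rng.integers(1, num_classes, size=true_labels.shape)
    return (true_labels + offsets) % num_classes


def cross_entropy_per_class(logits):
    """Return loss[i, k] = -log softmax(logits[i])[k] for every class k."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs


def complementary_risk(logits, comp_labels):
    """Average of sum_k loss_k - (K - 1) * loss_{complementary label}.

    With uniformly drawn complementary labels, the expectation of each term
    equals the ordinary loss on the (unseen) true label.
    """
    losses = cross_entropy_per_class(logits)
    num_classes = logits.shape[1]
    rows = np.arange(logits.shape[0])
    per_example = losses.sum(axis=1) - (num_classes - 1) * losses[rows, comp_labels]
    return per_example.mean()
```

For a single example, averaging `complementary_risk` over every possible complementary label gives back the ordinary cross-entropy on the true label, which is what makes learning from complementary labels alone feasible in this simplified setting. Individual terms can be negative, so the estimate is only meaningful on average.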
What problem does this paper attempt to address?
The problem this paper addresses is how to use complementary-label (CL) data in the source domain to train a target-domain classifier in unsupervised domain adaptation (UDA), thereby reducing the dependence on expensive true-label (TL) data. Specifically, the paper considers two scenarios:

1. **Completely Complementary UDA (CC-UDA)**: The source domain contains only complementary-label data.
2. **Partly Complementary UDA (PC-UDA)**: The source domain contains a large amount of complementary-label data and a small amount of true-label data.

### Background and Motivation

Traditional UDA methods usually require a large amount of true-label data to train the source-domain classifier, which is expensive and sometimes impractical to collect. In contrast, complementary-label data (labels that specify a class a pattern does not belong to) is much easier and cheaper to collect. The paper therefore proposes a new setting in which complementary-label data is used for UDA to reduce the annotation cost.

### Main Contributions

1. **Problem Definition**:
   - Defined two problem settings, CC-UDA and PC-UDA.
   - Provided a theoretical risk bound (learning bound) for these settings.
2. **Method**:
   - Proposed a one-step solution named CLARINET (Complementary Label Adversarial Network), which handles complementary-label data and source-to-target distribution adaptation by training two deep networks simultaneously.
   - One network in CLARINET focuses on classifying the complementary-label source data, and the other is responsible for aligning the source and target distributions.
3. **Experimental Verification**:
   - Conducted experiments on multiple handwritten digit-recognition and object-recognition tasks; the results show that CLARINET significantly outperforms a series of baseline methods.
   - The experiments also showed that CLARINET's performance improves further when a small amount of true-label data is available in the source domain.

### Theoretical Analysis

The paper provides a theoretical risk bound for complementary-label UDA, showing that, under the unbiased assumption, it is possible to learn effectively from complementary-label data and transfer that knowledge to the target domain. Specifically, the paper derives the following bound:

\[ L_t(F \circ G) \leq L_s(F \circ G) + \epsilon + d_{\mathcal{L}}^{\mathcal{F}, G}(\otimes F \# P_{X_s}, \otimes F \# P_{X_t}) \]

where:

- \( L_t(F \circ G) \) is the risk on the target domain.
- \( L_s(F \circ G) \) is the risk on the source domain.
- \( \epsilon \) is the minimum empirical risk over the source domain and the target domain.
- \( d_{\mathcal{L}}^{\mathcal{F}, G}(\otimes F \# P_{X_s}, \otimes F \# P_{X_t}) \) is the tensor discrepancy distance, which measures the distribution difference between the source domain and the target domain.

### Experimental Results

The experimental results show that CLARINET performs well on multiple complementary-label UDA tasks, in particular handwritten digit recognition and object recognition, where it significantly outperforms the existing baseline methods. In addition, when a small amount of true-label data is available in the source domain, CLARINET's performance improves further, which supports the value of combining true-label and complementary-label data.

### Summary

This paper addresses the cost of true-label data in UDA by introducing complementary-label data, proposes an effective one-step solution, CLARINET, and provides a theoretical