SepLL: Separating Latent Class Labels from Weak Supervision Noise

Andreas Stephan, Vasiliki Kougia, Benjamin Roth
DOI: https://doi.org/10.48550/arXiv.2210.13898
2022-10-25
Abstract: In the weakly supervised learning paradigm, labeling functions automatically assign heuristic, often noisy, labels to data samples. In this work, we provide a method for learning from weak labels by separating two types of complementary information associated with the labeling functions: information related to the target label and information specific to one labeling function only. Both types of information are reflected to different degrees by all labeled instances. In contrast to previous works that aimed at correcting or removing wrongly labeled instances, we learn a branched deep model that uses all data as-is, but splits the labeling function information in the latent space. Specifically, we propose the end-to-end model SepLL, which extends a transformer classifier by introducing a latent space for labeling-function-specific and task-specific information. The learning signal is only given by the labeling function matches; no pre-processing or label model is required for our method. Notably, the task prediction is made from the latent layer without any direct task signal. Experiments on Wrench text classification tasks show that our model is competitive with the state-of-the-art, and yields a new best average performance.
Machine Learning
What problem does this paper attempt to address?
The paper addresses how to separate the latent class-label information relevant to the target task from the noise in weak labels. The authors propose SepLL (Separating Latent Class Labels from Weak Supervision Noise), which splits the information carried by the labeling functions into two kinds: information related to the target label and information specific to a single labeling function. Modeling the two separately is intended to improve performance under weak supervision.

### Problem Background

In weakly supervised learning, labeling functions automatically assign heuristic, often noisy, labels to data samples. These noisy labels can introduce bias or errors during training. Existing methods usually try to correct or remove wrongly labeled instances, but they may require pre-processing or a separate label model and may not make full use of the data.

### Core Contributions of the Paper

1. **Introducing New Intuition**: Each labeling function carries information related to the target task as well as information specific to the labeling function itself. The authors treat these two types of information as complementary and design a model that separates them.
2. **Proposing the SepLL Model**: SepLL is an end-to-end model that extends a transformer classifier with a latent space for labeling-function-specific and task-specific information. The model splits the labeling function information across two branched latent layers and recombines them to predict which labeling functions match.
3. **Experimental Verification**: The authors ran experiments on Wrench text classification tasks. The results show that SepLL is competitive with the state of the art and yields a new best average performance.

### Formula Presentation

The key formulas behind SepLL are:

- **Latent Representation Transformation**:
  \[ z = h(x)\in\mathbb{R}^d \]
  where \(h\) is a pre-trained transformer encoder and \(x\) is the input text.
- **Task-Specific Path and Labeling-Function-Specific Path**:
  \[ \tilde{f}_Y(z)\in\mathbb{R}^{|Y|} \]
  \[ \tilde{f}_L(z)\in\mathbb{R}^m \]
  which are the task-specific and labeling-function-specific transformations respectively, for \(|Y|\) classes and \(m\) labeling functions.
- **Combining Latent Layers**:
  \[ \hat{f}_L(z)=\tilde{f}_Y(z)T^{\top}+\tilde{f}_L(z)\in\mathbb{R}^m \]
  where \(T\) is an \(m\times|Y|\) mapping matrix that maps task information to the corresponding labeling function information.
- **Cross-Entropy Loss**:
  \[ CE(P, Q)=-\frac{1}{n}\sum_{i = 1}^{n}\sum_{j = 1}^{m}P_{ij}\log(Q_{ij}) \]
  where \(P\) is the normalized distribution of labeling function matches, \(Q\) is the distribution predicted by the model, and \(n\) is the number of samples.
- **Task Prediction**:
  \[ P(y_i|x)=\frac{\exp(\tilde{f}_Y(z)_i)}{\sum_{j = 1}^{|Y|}\exp(\tilde{f}_Y(z)_j)} \]

### Summary

The paper's main contribution is a method for handling noisy labels in weakly supervised learning. By separating the task-related and labeling-function-specific parts of the labeling function information, the model uses all data as-is, without pre-processing or a label model. In the reported experiments on the Wrench benchmark, SepLL is competitive with the state of the art and achieves the best average performance.
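
The formulas listed under Formula Presentation can be traced end to end in a few lines of NumPy. This is a minimal sketch under several assumptions and not the authors' implementation: the encoder output \(z\) is replaced by random vectors, both branches are single linear layers (`W_Y` and `W_L` are hypothetical weight names), \(Q\) is taken to be a softmax over \(\hat{f}_L(z)\), and \(P\) is obtained by dividing each row of the match matrix by its row sum.

```python
import numpy as np

def softmax(a, axis=-1):
    # Numerically stable softmax along the given axis.
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def sepll_forward(z, W_Y, W_L, T):
    # Task-specific branch: f_Y(z) in R^{|Y|} per sample.
    f_Y = z @ W_Y
    # Labeling-function-specific branch: f_L(z) in R^m per sample.
    f_L = z @ W_L
    # Recombination: f_hat_L(z) = f_Y(z) T^T + f_L(z), in R^m per sample.
    f_hat = f_Y @ T.T + f_L
    return f_Y, f_hat

def cross_entropy(P, Q, eps=1e-12):
    # CE(P, Q) = -(1/n) * sum_i sum_j P_ij * log(Q_ij); eps guards log(0).
    return -np.mean(np.sum(P * np.log(Q + eps), axis=1))

rng = np.random.default_rng(0)
n, d, num_classes, m = 4, 8, 2, 3

# Stand-in for the transformer encoder output z = h(x) (assumption).
z = rng.normal(size=(n, d))
# Hypothetical linear heads for the two branches.
W_Y = rng.normal(size=(d, num_classes))
W_L = rng.normal(size=(d, m))

# T[j, k] = 1 if labeling function j votes for class k; shape (m, |Y|).
T = np.array([[1, 0], [1, 0], [0, 1]], dtype=float)

# Toy labeling-function matches; every row has at least one match,
# so the row normalization below never divides by zero.
matches = np.array([[1, 1, 0], [0, 0, 1], [1, 0, 1], [0, 1, 0]], dtype=float)
P = matches / matches.sum(axis=1, keepdims=True)

f_Y, f_hat = sepll_forward(z, W_Y, W_L, T)
Q = softmax(f_hat)            # predicted distribution over labeling functions
loss = cross_entropy(P, Q)    # training signal comes from matches only
class_probs = softmax(f_Y)    # task prediction read from the latent layer
```

Note that `loss` depends only on the labeling function matches, while `class_probs` is read from the task branch, which mirrors the abstract's statement that the task prediction is made without any direct task signal.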