Data-efficient Active Learning for Structured Prediction with Partial Annotation and Self-Training

Zhisong Zhang, Emma Strubell, Eduard Hovy
2023-10-19
Abstract: In this work we propose a pragmatic method that reduces the annotation cost for structured label spaces using active learning. Our approach leverages partial annotation, which reduces labeling costs for structured outputs by selecting only the most informative sub-structures for annotation. We also utilize self-training to incorporate the current model's automatic predictions as pseudo-labels for un-annotated sub-structures. A key challenge in effectively combining partial annotation with self-training to reduce annotation cost is determining which sub-structures to select to label. To address this challenge, we adopt an error estimator to adaptively decide the partial selection ratio according to the current model's capability. In evaluations spanning four structured prediction tasks, we show that our combination of partial annotation and self-training using an adaptive selection ratio reduces annotation cost over strong full annotation baselines under a fair comparison scheme that takes reading time into consideration.
Computation and Language
What problem does this paper attempt to address?
This paper addresses how to reduce annotation cost in structured prediction tasks. The authors propose a method that combines partial annotation (PA) with self-training. Active learning (AL) selects the most informative sub-structures for annotation, and the model's automatic predictions serve as pseudo-labels for the unannotated sub-structures, which reduces the annotation workload while preserving model performance.

### Main Problems

1. **Reducing Annotation Cost**: Structured prediction tasks usually require a large amount of annotated data, which is both time-consuming and costly to produce. The authors aim to reduce the amount of annotation required, through partial annotation and self-training, without sacrificing model performance.
2. **Selecting Appropriate Sub-Structures**: In partial annotation, the key issue is deciding which sub-structures to label. The authors propose an adaptive selection strategy based on an error estimator, which adjusts the selection ratio dynamically according to the current capability of the model.
3. **Effectively Utilizing Unannotated Data**: Through self-training, the model's predictions on unannotated data are used as additional training signals to further improve performance.

### Method Overview

- **Partial Annotation (PA)**: Select only the most uncertain sub-structures in a sentence for annotation, rather than every structure in the sentence. This reduces the annotation workload.
- **Self-Training**: Use the model's predictions on unannotated data as pseudo-labels to strengthen training.
- **Adaptive Selection Strategy**: Determine the partial-annotation selection ratio dynamically through an error estimator, so that the amount of annotation requested tracks the current capability of the model.

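
As a rough illustration of how the three components above could fit together, the following Python sketch ranks the sub-structures of one sentence by uncertainty, sends the top fraction to an annotator, and pseudo-labels the rest. It is not the authors' implementation: the least-confidence uncertainty score, the use of mean uncertainty as a stand-in error estimator, the hard argmax pseudo-labels, and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# One predicted label distribution per sub-structure (e.g. per token).
Dist = Dict[str, float]


def uncertainty(dist: Dist) -> float:
    """Least-confidence score: 1 minus the probability of the top label."""
    return 1.0 - max(dist.values())


def estimate_error_rate(dists: List[Dist]) -> float:
    """Hypothetical error estimator: the mean least-confidence score.

    Assumption: a real estimator would be fit on already-annotated data.
    """
    if not dists:
        return 0.0
    return sum(uncertainty(d) for d in dists) / len(dists)


def split_for_annotation(dists: List[Dist]) -> Tuple[List[int], Dict[int, str]]:
    """Return (indices to annotate, pseudo-labels for the remaining indices)."""
    # Adaptive selection ratio: request more labels when the model looks weaker.
    ratio = estimate_error_rate(dists)
    k = round(ratio * len(dists))

    # Rank sub-structures from most to least uncertain.
    ranked = sorted(
        range(len(dists)), key=lambda i: uncertainty(dists[i]), reverse=True
    )

    # Partial annotation: only the k most uncertain items go to the annotator.
    to_annotate = sorted(ranked[:k])

    # Self-training: the model's own top label fills in everything else.
    pseudo = {i: max(dists[i], key=dists[i].get) for i in ranked[k:]}
    return to_annotate, pseudo
```

For example, with four tokens whose top-label probabilities are 0.95, 0.5, 0.9 and 0.55, the estimated error rate is 0.275, so this sketch would send one token (the most uncertain) for annotation and assign pseudo-labels to the other three.
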
### Experimental Setup

- **Tasks**: Named Entity Recognition (NER), Dependency Parsing (DPAR), Event Extraction, and Relation Extraction.
- **Datasets**: CoNLL-2003 (NER), English Web Treebank (DPAR), ACE05 (Event Extraction and Relation Extraction).
- **Evaluation Metrics**: Reading cost (measured by the total number of words in the sentences read) and annotation cost (measured by the number of annotated sub-structures).

### Results

- **NER**: At the same reading cost, partial annotation (PA) achieves performance comparable to full annotation (FA) with fewer annotated sub-structures.
- **DPAR**: Partial annotation (PA) likewise maintains performance similar to full annotation (FA) while reducing annotation cost.
- **Adaptive Selection Strategy**: The adaptive strategy adjusts the selection ratio according to the current capability of the model, which cuts unnecessary annotation.

### Conclusion

By combining partial annotation with self-training, the paper reduces annotation cost across multiple structured prediction tasks while maintaining model performance. The adaptive selection strategy and self-training are both key to this result.