Abstract: The superior performance of supervised classification methods in the information extraction (IE) area heavily relies on a large amount of gold standard data. Recent zero-shot classification methods converted the task to other NLP tasks (e.g., textual entailment) and used off-the-shelf models of these NLP tasks to directly perform inference on the test data without using a large amount of IE annotation data. A potentially valuable by-product of these methods is the large-scale silver standard data, i.e., data pseudo-labeled by the off-the-shelf models of other NLP tasks. However, there has been no further investigation into the use of these data. In this paper, we propose a new framework, Clean-LaVe, which aims to utilize silver standard data to enhance the zero-shot performance. Clean-LaVe includes four phases: (1) Obtaining silver data; (2) Identifying relatively clean data from silver data; (3) Finetuning the off-the-shelf model using clean data; (4) Inference on the test data. The experimental results show that Clean-LaVe can outperform the baseline by 5% and 6% on the TACRED and Wiki80 datasets, respectively, in the zero-shot relation classification task, by 3%-7% on Smile (Korean and Polish) in the zero-shot cross-lingual relation classification task, and by 8% on ACE05-E+ in the zero-shot event argument classification task. The code is shared in
What problem does this paper attempt to address?
### Problems the paper attempts to solve
The paper addresses zero-shot classification in the Information Extraction (IE) field. Specifically, it asks how large-scale silver-standard data (i.e., data pseudo-labeled by off-the-shelf models of other NLP tasks) can be used to improve zero-shot classification performance. Supervised methods rely on a large amount of gold-standard (manually annotated) data, which is often difficult to obtain in practice. Existing zero-shot methods produce silver-standard data as a by-product, but its use has not been investigated further. The paper therefore proposes a new framework, Clean-LaVe, that uses silver-standard data to enhance zero-shot classification.
### Main contributions
1. **Propose the Clean-LaVe framework**: The framework obtains silver-standard data with an off-the-shelf model, identifies a relatively clean subset of that data, fine-tunes the off-the-shelf model on the clean subset, and finally classifies the test data with the fine-tuned model.
2. **Introduce the clean data detection module**: This module improves the accuracy and diversity of data selection through Iteratively Weighted Negative Learning (IWNL) and a Class-Aware Data Selector (CADS).
3. **Experimental results**: Clean-LaVe outperforms the baseline methods in multiple zero-shot classification tasks, including zero-shot relation classification (TACRED and Wiki80 datasets), zero-shot cross-lingual relation classification (Italian, Polish and Korean in the Smile dataset), and zero-shot event argument classification (ACE05-E+ dataset).
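
The two ideas behind the clean data detection module in item 2 can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the choice to weight the loss by the complementary label's class, and the fixed per-class budget `k` are all assumptions made for illustration, and the relation labels and confidences are made-up sample values.

```python
import math
from collections import defaultdict

def weighted_negative_loss(probs, complementary_label, class_weights):
    """Negative-learning loss for one sample (illustrative form).

    Instead of pulling the model toward a possibly wrong pseudo-label, it
    pushes down the probability of a label the sample is assumed NOT to
    have, scaled by a per-class weight (assumed weighting scheme).
    """
    p = probs[complementary_label]
    eps = 1e-12  # avoid log(0) when p is exactly 1.0
    return -class_weights[complementary_label] * math.log(max(1.0 - p, eps))

def select_per_class(samples, k):
    """Keep the k most confident samples of each pseudo-label class.

    `samples` is a list of (sample_id, pseudo_label, confidence) tuples.
    Selecting per class keeps frequent, high-confidence classes from
    crowding out rarer ones.
    """
    by_class = defaultdict(list)
    for sample_id, label, conf in samples:
        by_class[label].append((conf, sample_id))
    selected = {}
    for label, items in by_class.items():
        items.sort(reverse=True)  # highest confidence first
        selected[label] = [sample_id for _, sample_id in items[:k]]
    return selected

# Made-up silver data: one class is much more confident than the other.
silver = [
    ("s1", "per:employee_of", 0.97),
    ("s2", "per:employee_of", 0.91),
    ("s3", "per:employee_of", 0.88),
    ("s4", "org:founded_by", 0.62),
    ("s5", "org:founded_by", 0.55),
]
clean = select_per_class(silver, k=2)
```

In this toy data a global top-4 by confidence would pick `s1`, `s2`, `s3` and only one `org:founded_by` sample, whereas the per-class selection keeps two samples from each class.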
### Method overview
1. **LaVeEntail method**:
- **Label expression**: Convert relation types or argument roles into natural language templates.
- **Textual entailment model inference**: Fill the templates to generate hypotheses, score each hypothesis against the input text with an off-the-shelf textual entailment model, and infer the relation type or argument role from the entailment scores.
2. **Clean data detection**:
- **Iteratively weighted negative learning**: Dynamically adjust the weights of categories to reduce the impact of unbalanced data and improve the model's robustness to noisy data.
- **Class-aware data selector**: Consider the diversity of categories when selecting clean data to avoid over-selection of certain categories.
3. **Fine-tuning and inference**:
- Use the detected clean data to fine-tune the off-the-shelf text entailment model.
- Finally, use the fine-tuned model to classify the test data.
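
The label-expression and entailment-inference steps of item 1 can be sketched as follows. This is a hedged illustration, not the paper's code: the templates, the function names, and especially `toy_entailment_score` are hypothetical. The scorer is a word-overlap stand-in so the example runs without any model; in practice it would be replaced by a real textual entailment model.

```python
# Hypothetical templates: one natural-language pattern per relation type.
TEMPLATES = {
    "per:employee_of": "{subj} works for {obj}.",
    "org:founded_by": "{subj} was founded by {obj}.",
}

def verbalize(subj, obj):
    """Turn every candidate relation into a hypothesis sentence."""
    return {rel: tpl.format(subj=subj, obj=obj) for rel, tpl in TEMPLATES.items()}

def toy_entailment_score(premise, hypothesis):
    """Stand-in for an entailment model: the fraction of hypothesis words
    that also occur in the premise. This is NOT a real entailment score."""
    premise_words = set(premise.lower().replace(".", "").split())
    hyp_words = hypothesis.lower().replace(".", "").split()
    return sum(w in premise_words for w in hyp_words) / len(hyp_words)

def predict_relation(premise, subj, obj, score_fn=toy_entailment_score):
    """Return (best_relation, score) under the given scoring function."""
    hypotheses = verbalize(subj, obj)
    scores = {rel: score_fn(premise, hyp) for rel, hyp in hypotheses.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

premise = "Acme was founded by Jane Doe in 1999."
label, score = predict_relation(premise, subj="Acme", obj="Jane Doe")
```

Running the same prediction over unlabeled text yields (sample, pseudo-label, score) triples, which is the kind of silver-standard data the later clean-data detection and fine-tuning steps start from.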
### Experimental setup and results
- **Datasets**: TACRED, Wiki80, Smile (multilingual), ACE05-E+.
- **Baseline methods**: Include supervised learning methods using different noise-robust losses, semi-supervised learning methods, and existing zero-shot methods such as QA4RE and Global_Constraints.
- **Experimental results**: According to the abstract, Clean-LaVe outperforms the baseline by 5% on TACRED and 6% on Wiki80 (zero-shot relation classification), by 3%-7% on Smile (Korean and Polish; zero-shot cross-lingual relation classification), and by 8% on ACE05-E+ (zero-shot event argument classification).
### Conclusion
The paper uses silver-standard data to improve the performance of zero-shot classification tasks through the proposed Clean-LaVe framework. The experimental results support the effectiveness of this method and suggest a new direction for future research.