Abstract: Data scarcity has been the main factor that hinders the progress of event extraction. To overcome this issue, we propose a Self-Training with Feedback (STF) framework that leverages the large-scale unlabeled data and acquires feedback for each new event prediction from the unlabeled data by comparing it to the Abstract Meaning Representation (AMR) graph of the same sentence. Specifically, STF consists of (1) a base event extraction model trained on existing event annotations and then applied to large-scale unlabeled corpora to predict new event mentions as pseudo training samples, and (2) a novel scoring model that takes in each new predicted event trigger, an argument, its argument role, as well as their paths in the AMR graph to estimate a compatibility score indicating the correctness of the pseudo label. The compatibility scores further act as feedback to encourage or discourage the model learning on the pseudo labels during self-training. Experimental results on three benchmark datasets, including ACE05-E, ACE05-E+, and ERE, demonstrate the effectiveness of the STF framework on event extraction, especially event argument extraction, with significant performance gain over the base event extraction models and strong baselines. Our experimental analysis further shows that STF is a generic framework as it can be applied to improve most, if not all, event extraction models by leveraging large-scale unlabeled data, even when high-quality AMR graph annotations are not available.

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper aims to address the issue of data scarcity in Event Extraction (EE). Although the development of deep learning has significantly improved the performance of event extraction, the insufficient amount of existing event annotation data remains a major constraint. For example, in the commonly used event extraction benchmark dataset ACE-05, 10 out of 33 event types have fewer than 80 annotations. However, creating high-quality event annotation data is very expensive and time-consuming.

To solve this problem, the authors propose a Self-Training with Feedback (STF) framework, which leverages large-scale unlabeled data and obtains feedback for each new event prediction by comparing it with Abstract Meaning Representation (AMR) graphs. Specifically, the STF framework consists of two main parts:

1. **Base Event Extraction Model**: First, a base event extraction model is trained on the existing event annotation data, and then it is applied to a large-scale unlabeled corpus to predict new event mentions as pseudo-training samples.
2. **Scoring Model**: This model receives each new predicted event trigger, arguments, and their roles, as well as their paths in the AMR graph, and estimates a compatibility score to indicate the correctness of the pseudo-labels. These compatibility scores further serve as feedback to encourage or discourage the model's learning of pseudo-labels during the self-training process.

Experimental results show that the STF framework significantly improves performance in event extraction tasks, especially in event argument extraction tasks, on three benchmark datasets (ACE05-E, ACE05-E+, and ERE), surpassing the base event extraction model and strong baseline methods. Additionally, experimental analysis indicates that STF is a general framework that can be applied to most event extraction models and can be effective even without high-quality AMR graph annotations.
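The role of the compatibility score as feedback can be illustrated with a minimal sketch. This is not the authors' implementation: the summary only states that the scores encourage or discourage learning on pseudo-labels, so the score range of [-1, 1], the `PseudoExample` structure, and the `feedback_weighted_loss` function are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class PseudoExample:
    """One pseudo-labeled (trigger, argument, role) prediction (hypothetical structure)."""
    trigger: str
    argument: str
    role: str
    log_prob: float       # base model's log-probability of its own prediction
    compatibility: float  # ASSUMED scoring-model output in [-1, 1]


def feedback_weighted_loss(examples):
    """Mean of -compatibility * log_prob over the pseudo-labeled examples.

    A positive score makes the term a scaled negative log-likelihood, so
    minimizing it raises the prediction's probability (encourage).
    A negative score flips the sign, so minimizing it lowers the
    prediction's probability (discourage).
    """
    if not examples:
        return 0.0
    total = sum(-ex.compatibility * ex.log_prob for ex in examples)
    return total / len(examples)


# Toy pseudo-labels: one judged compatible with the AMR graph, one not.
good = PseudoExample("attacked", "soldiers", "Attacker", math.log(0.5), 1.0)
bad = PseudoExample("attacked", "Tuesday", "Target", math.log(0.5), -1.0)

print(feedback_weighted_loss([good]))  # positive: behaves like ordinary NLL
print(feedback_weighted_loss([bad]))   # negative: pushes probability down
```

In a real training loop the same weighting would be applied to differentiable model outputs, so the sign and magnitude of each score would scale the gradient contributed by the corresponding pseudo-label.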

Improve Event Extraction via Self-Training with Gradient Guidance
