EXCEEDS: Extracting Complex Events as Connecting the Dots to Graphs in Scientific Domain

Yi-Fan Lu, Xian-Ling Mao, Bo Wang, Xiao Liu, Heyan Huang
2024-06-20
Abstract: It is crucial to utilize events to understand a specific domain. There is a lot of research on event extraction in domains such as news, finance, and biology. However, the scientific domain still lacks event extraction research, including comprehensive datasets and corresponding methods. Compared to other domains, the scientific domain presents two characteristics: denser nuggets and more complex events. To close this gap with these two characteristics in mind, we first construct SciEvents, a large-scale multi-event document-level dataset with a schema tailored for the scientific domain. It has 2,508 documents and 24,381 events under refined annotation and quality control. Then, we propose EXCEEDS, a novel end-to-end scientific event extraction framework that stores dense nuggets in a grid matrix and simplifies complex event extraction into a dot construction and connection task. Experimental results demonstrate state-of-the-art performance of EXCEEDS on SciEvents. Additionally, we release SciEvents and EXCEEDS on GitHub.
Computation and Language
What problem does this paper attempt to address?
The paper addresses the lack of Event Extraction (EE) research in the scientific domain. Compared to fields such as news, finance, and biology, the scientific domain still lacks comprehensive datasets and corresponding methods. Scientific literature is also characterized by high information density (dense nuggets) and complex events. The main contributions of the paper are as follows:

1. **Constructing the SciEvents Dataset**: To fill the gap in event extraction datasets for the scientific domain, the authors constructed SciEvents, a large-scale multi-event document-level dataset. It contains 2,508 annotated documents and 24,381 events, with an event schema tailored for the scientific domain.
2. **Proposing a New Method, EXCEEDS**: To handle dense and complex events in the scientific domain, the authors proposed an end-to-end framework named EXCEEDS. It stores dense information in a grid matrix and simplifies complex event extraction into a dot construction and connection task. Specifically, EXCEEDS encodes the relationships between all word pairs and decodes all events at once during inference.
3. **Defining New Evaluation Metrics**: New event extraction tasks were defined on SciEvents, and additional metrics were introduced to evaluate a model's ability to extract hierarchical events. Experimental results show state-of-the-art performance for EXCEEDS on SciEvents, especially in hierarchical event extraction.

In summary, the paper aims to advance research on, and practical use of, event extraction in the scientific domain by constructing the SciEvents dataset for scientific events and proposing the EXCEEDS event extraction framework.
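
The grid-matrix idea described above can be illustrated with a toy sketch. This is not the authors' implementation: the two-layer grid layout, the labels (`Trigger:Propose`, `Method`, `Task`, `Object`, `Purpose`), and the helper names `empty_grid` and `decode` are all hypothetical, and the grids are filled by hand rather than predicted by a model. The sketch only shows how labels over word pairs could be decoded in one pass, first by building nuggets ("dots") and then by connecting them.

```python
from typing import Dict, List, Optional, Tuple

Grid = List[List[Optional[str]]]


def empty_grid(n: int) -> Grid:
    """Return an n x n grid over word pairs with no labels set."""
    return [[None] * n for _ in range(n)]


def decode(tokens: List[str], dot_grid: Grid, link_grid: Grid) -> List[Tuple[str, str, str]]:
    """Decode (head nugget, role, tail nugget) triples from two label grids.

    Assumptions of this sketch (not taken from the paper):
      * dot_grid[i][j] with i <= j labels the span tokens[i..j] as a nugget.
      * link_grid[i][j] labels a role between the nuggets starting at i and j.
      * at most one nugget starts at any given token.
    """
    n = len(tokens)

    # Step 1: dot construction. Collect every labelled span, keyed by start token.
    dots: Dict[int, Tuple[int, str]] = {}
    for i in range(n):
        for j in range(i, n):
            if dot_grid[i][j] is not None:
                dots[i] = (j, dot_grid[i][j])

    def text(start: int) -> str:
        end, _ = dots[start]
        return " ".join(tokens[start:end + 1])

    # Step 2: dot connection. Read every labelled word pair that joins two dots.
    triples: List[Tuple[str, str, str]] = []
    for i in range(n):
        for j in range(n):
            role = link_grid[i][j]
            if role is not None and i in dots and j in dots:
                triples.append((text(i), role, text(j)))
    return triples


tokens = ["We", "propose", "EXCEEDS", "for", "event", "extraction"]
dot_grid = empty_grid(len(tokens))
link_grid = empty_grid(len(tokens))

# Hand-filled, hypothetical labels standing in for model predictions.
dot_grid[1][1] = "Trigger:Propose"  # "propose"
dot_grid[2][2] = "Method"           # "EXCEEDS"
dot_grid[4][5] = "Task"             # "event extraction"
link_grid[1][2] = "Object"          # propose -> EXCEEDS
link_grid[1][4] = "Purpose"         # propose -> event extraction

events = decode(tokens, dot_grid, link_grid)
print(events)
```

Because every word pair has its own cell, many nuggets and links can coexist in one sentence, and a single scan over the grids recovers all of them together, which mirrors the "decode all events at once" description above.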