Abstract: Document-level Relation Extraction (RE) is a promising task that aims to identify the relations of multiple entity pairs in a document. Compared with its sentence-level counterpart, it raises two significant challenges: a) In most cases, a relational fact can be adequately expressed by a small subset of sentences from the document, namely the evidence. Traditional methods, however, cannot model the strong semantic correlations between evidence sentences that jointly describe a specific relation; b) The data for this task is extremely long-tailed, with too many NA instances and imbalanced relation types. Such data can bias the RE model's predictions for tail categories toward the head categories. In this paper, we present a novel Evidence reasoning and Curriculum learning method for DocRE (DRE-EC) to address these challenges. Specifically, we first formulate evidence extraction as a sequential decision problem, solved through a crafted reinforcement learning mechanism with an efficient path searching strategy that reduces the action space. Providing the evidence for each entity pair in advance, as a customized filtered document, helps the model infer relations better. To address the long-tail issue, we further develop a hybrid curriculum learning method at the NA level (NC) and the relation level (RC) with our customized difficulty measure score. In NC, the NA samples are scheduled easy-to-hard and added gradually, so that the data distribution moves from ideal and balanced to real and imbalanced. In RC, the scheme is switched to hard-to-easy to strengthen learning on the hard and tail samples. In addition, we propose a new Equalization adaptive Focal Loss (EFLoss) that can adjust to the changing data distribution and focus more on the tail categories. We conduct various experiments on two document-level RE benchmarks and achieve a remarkable improvement over previous competitive baselines. Furthermore, we provide detailed analyses of the advantages and effectiveness of our method.
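
The NA-level curriculum (NC) described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Sample` class, the `difficulty` field, the `na_curriculum` function, and the linear pacing (an equal share of NA samples added per stage) are all assumptions made for illustration, since the abstract does not specify the difficulty measure or the schedule. The sketch only shows the general idea of keeping all relational samples while adding NA samples from easiest to hardest, so the training distribution drifts from balanced toward the real, NA-heavy one.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Sample:
    """Hypothetical training instance (one entity pair in one document)."""
    sample_id: int
    is_na: bool        # True if the entity pair has no relation (NA)
    difficulty: float  # assumed precomputed score; higher means harder


def na_curriculum(samples: List[Sample], num_stages: int) -> Iterator[List[Sample]]:
    """Yield one training set per stage.

    Every stage keeps all relational samples. NA samples are sorted
    easy-to-hard and a growing prefix is added at each stage
    (linear pacing, an assumption), so the last stage uses the full data.
    """
    relational = [s for s in samples if not s.is_na]
    na_sorted = sorted((s for s in samples if s.is_na), key=lambda s: s.difficulty)
    for stage in range(1, num_stages + 1):
        cutoff = len(na_sorted) * stage // num_stages
        yield relational + na_sorted[:cutoff]


if __name__ == "__main__":
    data = [
        Sample(0, False, 0.5),
        Sample(1, False, 0.2),
        Sample(2, True, 0.9),
        Sample(3, True, 0.1),
        Sample(4, True, 0.4),
        Sample(5, True, 0.7),
    ]
    for i, stage_set in enumerate(na_curriculum(data, num_stages=2), start=1):
        na_ids = [s.sample_id for s in stage_set if s.is_na]
        print(f"stage {i}: {len(stage_set)} samples, NA ids in order added: {na_ids}")
```

The relation-level curriculum (RC) would reverse the ordering to hard-to-easy; it is left out here because the abstract gives no further detail on how it is paced.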

An End-to-end Joint Model for Evidence Information Extraction from Court Record Document

Evidence Sentence Extraction for Machine Reading Comprehension

Establish Evidence Chain Model on Chinese Criminal Judgment Documents Using Text Similarity Measure

Evidence-aware Document-level Relation Extraction

Ensemble Methods for Word Embedding Model Based on Judicial Text

Scientific Discourse Tagging for Evidence Extraction

Build Evidence Chain Relational Model Based on Chinese Judgment Documents

A Joint Entity and Relation Extraction Model based on Efficient Sampling and Explicit Interaction

CJE-PCHF: Chinese Joint Entity and Relation Extraction Model Based on Progressive Contrastive Learning and Heterogeneous Feature Fusion

BERT-CNN based evidence retrieval and aggregation for Chinese legal multi-choice question answering

Enhancing cross-evidence reasoning graph for document-level relation extraction

Mining heuristic evidence sentences for more interpretable document-level relation extraction

Modeling Instance Interactions for Joint Information Extraction with Neural High-Order Conditional Random Field

A tag based joint extraction model for Chinese medical text

Learning with Joint Cross-Document Information Via Multi-Task Learning for Named Entity Recognition

Evidence Reasoning and Curriculum Learning for Document-Level Relation Extraction

A Cascade Dual-Decoder Model for Joint Entity and Relation Extraction

Various Legal Factors Extraction Based on Machine Reading Comprehension

DELTA: Pre-train a Discriminative Encoder for Legal Case Retrieval via Structural Word Alignment

Data-efficient End-to-end Information Extraction for Statistical Legal Analysis

A Joint Learning Information Extraction Method Based on an Effective Inference Structure