Abstract: Extracting entities and relations from text is a significant task in information extraction. Existing extraction models often output their most confident predictions directly, without any reconsideration or double-checking, which leads to avoidable mistakes and sub-optimal performance. In this paper, we propose a novel coarse-to-fine extraction framework, which first extracts high-potential relations and entities via knowledge distillation, and then rechecks the predictions in a fine-grained manner via a handcrafted natural language inference (NLI) task. Specifically, based on the knowledge distillation mechanism, we train multiple teacher models iteratively with an adaptive loss function that makes each teacher concentrate more on the data that the others handle poorly. These complementary teacher models then provide valuable soft-label information for training a considerate student model, enabling it to generate reliable preliminary predictions. Further, the generated potential relations and entities are formulated as hypotheses and, together with the original sentences as premises, serve as the input to an NLI model. Considering the linguistic diversity of relational expressions, we automatically generate various semantic templates for hypotheses through an $\mathcal{N}$-gram mining strategy. Moreover, because multi-fact sentences exist, a relation-guided Gaussian attention is designed to reduce the gap between the single-relation hypothesis and the multi-relation premise. To make training efficient, we also develop several ways to generate high-quality negative samples, which help the NLI model learn to identify errors. Experimental results show that the proposed method is effective and outperforms other strong baselines on public benchmarks.
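The recheck step described above pairs each original sentence (the premise) with a hypothesis built from a predicted relation and its entities. The following is a minimal sketch of that pairing only, not the authors' implementation: the `Triple` type, the `TEMPLATES` dictionary, the relation name `born_in`, and both function names are hypothetical, and the templates are written by hand here, whereas the paper mines them automatically. The entity-swap negative is likewise just one assumed way of producing a negative sample; the abstract does not specify the strategies actually used.

```python
from typing import Dict, List, NamedTuple, Tuple


class Triple(NamedTuple):
    """A preliminary (head, relation, tail) prediction (hypothetical type)."""
    head: str
    relation: str
    tail: str


# Hand-written templates for illustration only (assumption); the paper
# generates such templates automatically via N-gram mining.
TEMPLATES: Dict[str, List[str]] = {
    "born_in": [
        "{head} was born in {tail}.",
        "{tail} is the birthplace of {head}.",
    ],
}


def build_nli_inputs(
    sentence: str,
    triples: List[Triple],
    templates: Dict[str, List[str]],
) -> List[Tuple[str, str]]:
    """Return (premise, hypothesis) pairs, one per triple and template."""
    pairs = []
    for triple in triples:
        # Relations without a template yield no hypotheses.
        for template in templates.get(triple.relation, []):
            hypothesis = template.format(head=triple.head, tail=triple.tail)
            pairs.append((sentence, hypothesis))
    return pairs


def swap_negative(triple: Triple) -> Triple:
    """Assumed negative-sampling heuristic: swap head and tail entities.

    This only yields a true negative for asymmetric relations.
    """
    return Triple(triple.tail, triple.relation, triple.head)


if __name__ == "__main__":
    premise = "Marie Curie was born in Warsaw."
    predicted = [Triple("Marie Curie", "born_in", "Warsaw")]
    for pair in build_nli_inputs(premise, predicted, TEMPLATES):
        print(pair)
    print(swap_negative(predicted[0]))
```

Each resulting pair would then be scored by an NLI model, with a prediction kept only if its hypothesis is judged to be entailed by the premise.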
