Abstract: Disclosing multiple overlapped relations from a sentence remains challenging. Most current neural models inconveniently assume that each sentence is explicitly mapped to a single relation label, and so cannot handle multiple relations properly, because the overlapped features of the relations are either ignored or very difficult to identify. To tackle this issue, we propose a novel approach for multi-labeled relation extraction with a capsule network, which performs considerably better than current convolutional or recurrent networks in identifying highly overlapped relations within an individual sentence. To better cluster the features and precisely extract the relations, we further devise an attention-based routing algorithm and a sliding-margin loss function, and embed them into our capsule network. The experimental results show that the proposed approach can indeed extract the highly overlapped features and achieves significant performance improvement for relation extraction compared with state-of-the-art works.

Introduction

Relation extraction plays a crucial role in many natural language processing (NLP) tasks. It aims to identify relation facts for pairs of entities in a sentence in order to construct triples such as [Arthur Lee, place born, Memphis]. Relation extraction has received renewed interest in the neural network era, as neural models are effective at extracting the semantic meanings of relations. Compared with traditional approaches, which focus on manually designed features, neural methods such as the Convolutional Neural Network (CNN) (Liu et al. 2013; Zeng et al. 2014) and the Recurrent Neural Network (RNN) (Zhang and Wang 2015; Zhou et al. 2016) have achieved significant improvement in relation classification. However, previous neural models are unlikely to scale to the scenario where a sentence has multiple relation labels, and they face challenges in extracting highly overlapped and discrete relation features because of the following two drawbacks.
First, one entity pair can express multiple relations in a sentence, which seriously confuses the relation extractor. For example, as in Figure 1, the entity pair [Arthur Lee, Memphis] holds three possible relations: place birth, place death and place lived. Sentences S1 and S2 can both express two relations, and sentence S3 represents another two relations. These sentences contain multiple kinds of relation features that are difficult to identify clearly. Existing neural models tend to merge low-level semantic meanings into one high-level relation representation vector with methods such as max-pooling (Zeng et al. 2014; Zhang, Zhao, and Qin 2016) and word-level attention (Zhou et al. 2016). However, one high-level relation vector is still insufficient to express multiple relations precisely.

Second, current methods neglect the discretization of relation features. For instance, as shown in Figure 1, all the sentences express their relations with a few significant words (in italics in the figure) distributed discretely in the sentences. However, common neural methods handle sentences with fixed structures, which makes it difficult to gather relation features from different positions. For example, being spatially sensitive, CNNs adopt convolutional feature detectors to extract local patterns from a sliding window over vector sequences and use max-pooling to select the prominent ones. Besides, the feature distribution of “no relation (NA, others)” in a dataset is different from that of definite relations. A sentence can be classified as “no relation” only when it does not contain any features of other relations.

∗Corresponding authors: Weijia Jia, Hai Zhao, {jia-wj, zhaohai}@cs.sjtu.edu.cn
Copyright © 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
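The sliding-window convolution and max-pooling described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy 2-dimensional embeddings, the hand-set filters, and the function names `conv_window_scores` and `max_pool_sentence` are hypothetical and do not come from the cited models. It only shows that pooling keeps one value per filter, so two sentences with cue words at different positions collapse to the same single vector.

```python
from typing import List, Sequence

Vector = List[float]


def conv_window_scores(embeddings: Sequence[Vector], kernel: Sequence[Vector]) -> List[float]:
    """Slide one filter (window x dim weights) over the sentence; one score per position."""
    window = len(kernel)
    scores = []
    for start in range(len(embeddings) - window + 1):
        score = 0.0
        for offset in range(window):
            token_vec = embeddings[start + offset]
            score += sum(w * x for w, x in zip(kernel[offset], token_vec))
        scores.append(score)
    return scores


def max_pool_sentence(embeddings: Sequence[Vector], kernels: Sequence[Sequence[Vector]]) -> Vector:
    """Keep only the strongest response of each filter, discarding where it occurred."""
    return [max(conv_window_scores(embeddings, kernel)) for kernel in kernels]


# Toy embeddings (assumption): dimension 0 is a "birth" cue, dimension 1 a "death" cue.
sentence_a = [[0.0, 0.0], [1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
sentence_b = [[0.0, 1.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [1.0, 0.0]]

# Two hand-set filters of window size 2, one per cue (assumption, not learned weights).
kernels = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]

pooled_a = max_pool_sentence(sentence_a, kernels)
pooled_b = max_pool_sentence(sentence_b, kernels)
print(pooled_a, pooled_b)
```

In this toy setting both sentences pool to the same vector even though their cue words sit at different positions, and both relation cues end up packed into one representation, which is the limitation the two drawbacks describe.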
In this paper, to extract overlapped and discrete relation features, we propose a novel approach for multi-labeled relation extraction with an attentive capsule network. As shown in Figure 1, the relation extractor of the proposed method is constructed with three major layers: feature extracting, feature clustering and relation predicting. The first layer extracts low-level semantic meanings. The second clusters low-level features into high-level relation representations, and the final one predicts relation types for each relation representation. The low-level features are extracted with traditional neural models such as Bidirectional Long Short-Term Memory (Bi-LSTM) and CNN. For the feature clustering layer, we utilize an attentive capsule network inspired by Sabour, Frosst, and Hinton (2017). A capsule (vector) is a small group of neurons used to express features. Its overall length indicates the significance of the features, and its direction suggests the specific property of the feature. The low-level semantic meanings from the first layer are embedded into a number of low-level capsules, which will

arXiv:1811.04354v1 [cs.CL] 11 Nov 2018

Figure 1 (excerpt):
ID: S1
Instance: [Arthur Lee], the leader of Love, died on Thursday in [Memphis].
Relations: person/place_death
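The capsule description above (length as significance, direction as property) can be made concrete with the squashing nonlinearity from Sabour, Frosst, and Hinton (2017), the work cited as inspiration. This is a minimal sketch only: this section does not state whether the proposed model uses exactly this form, and the names `squash` and `capsule_length` and the `eps` stabilizer are assumptions introduced here.

```python
import math
from typing import List


def capsule_length(v: List[float]) -> float:
    """Euclidean length of a capsule vector, read as the significance of its feature."""
    return math.sqrt(sum(x * x for x in v))


def squash(s: List[float], eps: float = 1e-9) -> List[float]:
    """Squashing from Sabour, Frosst, and Hinton (2017):
    v = (|s|^2 / (1 + |s|^2)) * (s / |s|).
    The direction of s is kept; its length is mapped into [0, 1).
    eps is an added numerical safeguard (assumption) for zero-length input.
    """
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


strong = squash([3.0, 4.0])  # long input: length close to 1
weak = squash([0.1, 0.0])    # short input: length close to 0
print(capsule_length(strong), capsule_length(weak))
```

A long input vector keeps its direction and gets a length near 1, while a short one shrinks toward 0, so the length can be read as how strongly a feature is present.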

Multi-Labeled Relation Extraction with Attentive Capsule Network.

[Figure 1 labels. Challenges: (1) overlapping relations, (2) discrete features. Layers: feature extracting, feature clustering, relation predicting.]
