Abstract: Recognizing the relation between two words is a fundamental task with broad applications. Unlike relation extraction from text, identifying relations between words without their contexts is difficult, and it is even harder for long-tail relations because their semantic features are inadequate. Existing approaches based on language models (LMs) draw on the rich knowledge in LMs to enhance the semantic features of relations. However, they capture common relations while overlooking less frequent but meaningful ones, since the knowledge in LMs depends heavily on training data, which mostly represents common relations; long-tail relations, by contrast, are rare in the training data. Using external knowledge to enrich LMs is appealing but nontrivial, because collecting a corpus that contains long-tail relations is hardly feasible. In this paper, we propose a sememe knowledge enhanced method (SememeLM) to enhance the representation of long-tail relations, in which sememes can break the contextual constraints between words. First, we present a sememe relation graph and propose a graph encoding method. Moreover, because an external knowledge base may contain a large amount of irrelevant knowledge, it introduces noise. We therefore propose a consistency alignment module, which aligns the introduced knowledge with LMs, reduces the noise, and integrates the knowledge into the language model. Finally, we conduct experiments on word analogy datasets, which evaluate the ability to distinguish subtle differences among relation representations, including those of long-tail relations. Extensive experiments show that our approach outperforms some state-of-the-art methods.

Improved Word Representation Learning with Sememes

Improving Sequence Modeling Ability of Recurrent Neural Networks via Sememes

SememeLM: A Sememe Knowledge Enhanced Method for Long-tail Relation Representation

Lexical Sememe Prediction Via Word Embeddings and Matrix Factorization

Improved Learning of Chinese Word Embeddings with Semantic Knowledge

Incorporating Chinese Characters of Words for Lexical Sememe Prediction

Chinese Word Sense Embedding with SememeWSD and Synonym Set

Modeling Semantic Compositionality With Sememe Knowledge

Sememe Prediction: Learning Semantic Knowledge from Unstructured Textual Wiki Descriptions

Cross-lingual Lexical Sememe Prediction

SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge

Chinese LIWC Lexicon Expansion Via Hierarchical Classification of Word Embeddings with Sememe Attention

Sememe Knowledge Computation: a Review of Recent Advances in Application and Expansion of Sememe Knowledge Bases

Predicting Categorial Sememe for English-Chinese Word Pairs via Representations in Explainable Sememe Space

Leveraging Human Prior Knowledge to Learn Sense Representations

Enhancing Semantic Word Representations by Embedding Deeper Word Relationships

Incorporating Sememes into Chinese Definition Modeling

Semantic Representations of Word Senses and Concepts

A Unified Model for Word Sense Representation and Disambiguation

Exploiting Word Semantics to Enrich Character Representations of Chinese Pre-trained Models

Sememe Prediction for BabelNet Synsets Using Multilingual and Multimodal Information