Abstract: The internal structural information of words has proven very effective for learning Chinese word embeddings. However, most previous approaches extract only a single form of internal feature to learn representations, ignoring how such information can be combined. They also focus only on the explicit features of internal structures, even though these structures also carry implicit semantics of words. In this paper, we propose Radical and Stroke-enhanced Word Embeddings (RSWE), a novel neural-network-based method for learning Chinese word embeddings with joint guidance from semantic and morphological internal information. RSWE enables an embedding model to learn simultaneously from (1) implicit semantic information derived from radicals, and (2) stroke n-gram information that can be obtained explicitly from Chinese words. During learning, RSWE uses stroke n-grams to capture the local structural features of words and integrates the implicit information derived from radicals to enhance the semantics of the embeddings. Through this combination, the semantics of Chinese words are effectively transferred into the learned embeddings. We evaluate RSWE on word similarity computation, word analogy reasoning, performance across embedding dimensions, performance across training corpus sizes, and named entity recognition. The experimental results show that our model outperforms existing state-of-the-art approaches.
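
To make the notion of stroke n-grams concrete, the following is a minimal sketch of how such features could be extracted from a word. It is not the RSWE implementation: the lookup table `TOY_STROKES`, the five-class stroke coding, the function name `stroke_ngrams`, and the default n-gram range are all assumptions introduced here for illustration.

```python
from typing import Dict, List

# Hypothetical toy table (an assumption, not taken from the paper): each
# character maps to a string of stroke-class codes, using an assumed coding of
# 1 = horizontal, 2 = vertical, 3 = left-falling, 4 = right-falling/dot, 5 = turning.
TOY_STROKES: Dict[str, str] = {
    "大": "134",
    "人": "34",
}


def stroke_ngrams(
    word: str,
    n_min: int = 3,
    n_max: int = 5,
    table: Dict[str, str] = TOY_STROKES,
) -> List[str]:
    """Return all stroke n-grams of a word for n in [n_min, n_max].

    The stroke codes of the word's characters are concatenated into a single
    sequence, and a sliding window of each size n is moved over that sequence.
    """
    seq = "".join(table[ch] for ch in word)
    return [
        seq[i : i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(seq) - n + 1)
    ]


if __name__ == "__main__":
    # The two characters contribute the stroke sequence "134" + "34" = "13434".
    print(stroke_ngrams("大人"))
```

In a model of this kind, each extracted n-gram would typically be given its own embedding vector. How RSWE combines those vectors with the radical-derived information is not specified in the abstract, so it is left out of this sketch.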

Radical and Stroke-Enhanced Chinese Word Embeddings Based on Neural Networks