Abstract: Distributed representations of words (also known as word embeddings) have proven helpful in solving natural language processing (NLP) tasks. Training distributed representations of words with neural networks has lately been a major focus of researchers in the field. Recent work on word embedding, namely the Continuous Bag-of-Words (CBOW) model and the Continuous Skip-gram (Skip-gram) model, has produced particularly impressive results, significantly speeding up the training process to enable word representation learning from large-scale data. However, neither CBOW nor Skip-gram pays enough attention to word proximity from the model perspective or to word ambiguity from the linguistics perspective. In this paper, we propose Proximity-Ambiguity Sensitive (PAS) models (i.e., PAS CBOW and PAS Skip-gram) to produce high-quality distributed representations of words that take both word proximity and ambiguity into account. From the model perspective, we introduce proximity weights as parameters to be learned in PAS CBOW and used in PAS Skip-gram. By better modeling word proximity, we reveal the strength of pooling-structured neural networks in word representation learning. The proximity-sensitive pooling layer can also be applied to other neural network applications that employ pooling layers. From the linguistics perspective, we train multiple representation vectors per word. Each representation vector corresponds to a particular group of POS tags of the word. By using PAS models, we achieved a 16.9% increase in accuracy over state-of-the-art models.
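The two ideas in the abstract, a proximity-weighted pooling layer and one vector per (word, POS group) pair, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pool_context`, the table layout, the normalization by the sum of weights, and the fixed weight values are all assumptions made for illustration. In the PAS models the proximity weights are learned parameters, whereas here they are simply given.

```python
from typing import Dict, List, Tuple

Vector = List[float]
# Hypothetical table: one vector per (word, POS group) pair, so an
# ambiguous word such as "bank" can have several representations.
EmbeddingTable = Dict[Tuple[str, str], Vector]


def pool_context(
    context: List[Tuple[int, str, str]],  # (relative offset, word, POS group)
    table: EmbeddingTable,
    proximity_weights: Dict[int, float],  # one weight per relative offset
) -> Vector:
    """Proximity-weighted average of the POS-specific context vectors.

    Assumption: weights are normalized by their sum, so uniform weights
    reduce to the plain average of the context vectors.
    """
    dim = len(next(iter(table.values())))
    pooled = [0.0] * dim
    total = 0.0
    for offset, word, pos_group in context:
        weight = proximity_weights[offset]
        vector = table[(word, pos_group)]
        for i in range(dim):
            pooled[i] += weight * vector[i]
        total += weight
    return [value / total for value in pooled]


# Toy data, invented for this example.
table = {
    ("river", "NOUN"): [1.0, 0.0],
    ("bank", "NOUN"): [0.0, 1.0],
    ("bank", "VERB"): [1.0, 1.0],
}
context = [(-1, "river", "NOUN"), (1, "bank", "NOUN")]

uniform = pool_context(context, table, {-1: 1.0, 1: 1.0})
skewed = pool_context(context, table, {-1: 3.0, 1: 1.0})
```

With uniform weights the pooled vector is the ordinary mean of the context vectors. Giving offset -1 a larger weight pulls the result toward the vector for "river", which is the kind of position sensitivity that proximity weights are meant to capture.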

Learning word representation by jointly using neighbor and syntactic contexts

Using Context-to-Vector with Graph Retrofitting to Improve Word Embeddings

Learning Word Representations by Jointly Modeling Syntagmatic and Paradigmatic Relations.

Learning Context-Specific Word/Character Embeddings.

A Unified Framework for Jointly Learning Distributed Representations of Word and Attributes.

Geometric Relationship Between Word and Context Representations

Co-learning of Word Representations and Morpheme Representations.

Learning Word Representation Considering Proximity and Ambiguity

Inside Out: Two Jointly Predictive Models For Word Representations And Phrase Representations

Learning Context-Sensitive Word Embeddings with Neural Tensor Skip-Gram Model

Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation

Context-Specific and Multi-Prototype Character Representations.

Distilling Monolingual and Crosslingual Word-in-Context Representations

A Probabilistic Model for Learning Multi-Prototype Word Embeddings.

Incorporating Linguistic Knowledge for Learning Distributed Word Representations.

A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings

Enhancing Unsupervised Semantic Parsing with Distributed Contextual Representations

Learning word embeddings from dependency relations

Leveraging Diverse Modeling Contexts with Collaborating Learning for Neural Machine Translation

Constructing Word-Context-Coupled Space Aligned with Associative Knowledge Relations for Interpretable Language Modeling

Learning Topic-Sensitive Word Representations