Survey on Distributed Word Embeddings Based on Neural Network Language Models

YU Ke-ren, FU Yun-bin, DONG Qi-wen
DOI: https://doi.org/10.3969/j.issn.1000-5641.2017.05.006
2017-01-01
Abstract: Distributed word embedding is one of the most important research topics in natural language processing. Its core idea is to represent the words in a text as low-dimensional vectors. There are many ways to generate such vectors, among which the methods based on neural network language models perform best. A representative example is Word2vec, an open source tool developed by Google Inc. in 2012. Distributed word embeddings can be used in many natural language processing tasks, such as text clustering, named entity tagging, and part-of-speech analysis. The quality of distributed word embeddings depends heavily on the performance of the neural network language model they are based on and on the specific task they are applied to. This paper gives an overview of distributed word embeddings based on neural networks from three aspects: the construction of classical neural network language models, optimization methods for the multi-class classification problem in language models, and the use of auxiliary structures to train word embeddings.