A General Framework For Text Semantic Analysis And Clustering On Yelp Reviews

Renfeng Jiang, Yimin Liu, Ke Xu; Mentor: Bryan McCann
2015-01-01
Abstract: Millions of user reviews have been posted on Yelp. Automatically extracting useful information from these reviews can benefit both users and businesses. Recent success in modeling the meaning of a word within the context of natural language processing (NLP) has shed light on how this might be done. Word2vec, an implementation of neural-network-based word-embedding approaches, has shown its ability to accurately capture semantic similarity among words. The transition from word2vec to doc2vec (document to vector) or text2vec (text to vector), however, remains an area of active research. In this study, a word2vec-based framework that learns from Yelp reviews to yield vector/matrix representations of Yelp reviews and Yelp businesses has been developed. Its application to the automatic recognition of similarity among different reviews or different businesses is shown to be successful. Furthermore, the framework is shown to handle practical tasks including business recommendation, business clustering, and review clustering.
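To make the step from word vectors to review vectors concrete, the following is a minimal sketch of one common baseline: averaging the vectors of the words in a review and comparing reviews by cosine similarity. It is not the framework developed in this study, whose details the abstract does not give. The toy 3-dimensional vectors, the TOY_VECTORS table, and the function names review_vector and cosine_similarity are all assumptions made for illustration; real word2vec vectors would be learned from the review corpus.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# In practice these would come from a trained word2vec model.
TOY_VECTORS = {
    "great":   [0.9, 0.1, 0.0],
    "tasty":   [0.8, 0.2, 0.1],
    "pizza":   [0.1, 0.9, 0.0],
    "pasta":   [0.2, 0.8, 0.1],
    "slow":    [0.0, 0.1, 0.9],
    "service": [0.1, 0.2, 0.8],
}


def review_vector(text, vectors=TOY_VECTORS):
    """Average the vectors of the known words in a review.

    Returns None when no word of the review is in the vocabulary.
    Averaging is an assumed baseline, not the paper's method.
    """
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


food_a = review_vector("great pizza")
food_b = review_vector("tasty pasta")
complaint = review_vector("slow service")

# Two food-praising reviews should score higher than praise vs. complaint.
print(cosine_similarity(food_a, food_b))
print(cosine_similarity(food_a, complaint))
```

Under the same assumption, a business could be represented by averaging the vectors of its reviews, which would allow the same similarity measure to drive recommendation or clustering.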