Aspect-based Latent Factor Model by Integrating Ratings and Reviews for Recommender System

Lin Qiu, Sheng Gao, Wenlong Cheng, Jun Guo
DOI: https://doi.org/10.1016/j.knosys.2016.07.033
IF: 8.139
2016-01-01
Knowledge-Based Systems
Abstract: Recommender systems have been recognized as an effective way to address the problem of personal information overload. Ratings, as an evaluation criterion revealing how much a customer likes a product, have long been the foundation of recommender systems built on the popular latent factor models. However, review texts, a valuable form of user-generated content, have largely been neglected. Recently, models that integrate ratings and review texts as training sources have attracted considerable attention; they typically model review texts with a topic model or one of its variants and then link the latent factor vectors to the topic distributions of the review texts. As a result, these integrated models need complicated optimization algorithms to fuse the heterogeneous sources, which may cause greater errors. In this work, we propose a novel model, called the Aspect-based Latent Factor Model (ALFM), that integrates ratings and review texts via a latent factor model: by jointly using the rating matrix, the user-review matrix and the item-attribute matrix, it derives user latent factors and item latent factors together with word latent factors. The proposed model aggregates all review texts written by the same user on the respective items and builds a user-review matrix from word frequencies. Similarly, an item's review is taken to be all review texts of the same item collected from the respective users. According to the different information extracted from review texts, we introduce two kinds of item-attribute matrix, which integrate the item-word frequencies and the polarity scores of the corresponding words. Experimental results on real-world data sets from amazon.com show that our model not only performs better than traditional and state-of-the-art models on the rating prediction task, but also accomplishes a cross-domain task by transferring word embeddings.
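The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_frequency_matrices`, the `(user, item, text)` input format and the lowercase whitespace tokenization are all assumptions made for illustration, and only the word-frequency variant of the item-attribute matrix is shown (the polarity-score variant is omitted).

```python
from collections import Counter, defaultdict


def build_frequency_matrices(reviews):
    """Build word-frequency matrices from (user, item, text) triples.

    Hypothetical helper: the input format and the tokenization below
    are assumptions, not details taken from the paper.
    """
    user_counts = defaultdict(Counter)  # all words written by each user
    item_counts = defaultdict(Counter)  # all words written about each item

    for user, item, text in reviews:
        tokens = text.lower().split()  # assumed: naive whitespace tokenizer
        user_counts[user].update(tokens)
        item_counts[item].update(tokens)

    # Shared vocabulary, so both matrices use the same word columns.
    vocab = sorted({w for counts in user_counts.values() for w in counts})
    users = sorted(user_counts)
    items = sorted(item_counts)

    # Rows are users (or items), columns are words, entries are counts.
    user_review = [[user_counts[u][w] for w in vocab] for u in users]
    item_attribute = [[item_counts[i][w] for w in vocab] for i in items]
    return users, items, vocab, user_review, item_attribute
```

Because both matrices share one word vocabulary, a joint factorization could tie the user and item factors to a common set of word factors, which appears to be the role the abstract assigns to the word latent factors.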