Dynamic recommender system : using cluster-based biases to improve the accuracy of the predictions

Modou Gueye, Talel Abdessalem, Hubert Naacke
DOI: https://doi.org/10.48550/arXiv.1212.0763
2012-12-03
Abstract: It is today accepted that matrix factorization models allow a high quality of rating prediction in recommender systems. However, a major drawback of matrix factorization is its static nature that results in a progressive declining of the accuracy of the predictions after each factorization. This is due to the fact that the new obtained ratings are not taken into account until a new factorization is computed, which can not be done very often because of the high cost of matrix factorization. In this paper, aiming at improving the accuracy of recommender systems, we propose a cluster-based matrix factorization technique that enables online integration of new ratings. Thus, we significantly enhance the obtained predictions between two matrix factorizations. We use finer-grained user biases by clustering similar items into groups, and allocating in these groups a bias to each user. The experiments we did on large datasets demonstrated the efficiency of our approach.
Machine Learning, Databases, Information Retrieval
What problem does this paper attempt to address?
This paper attempts to solve the problem of dynamism in recommendation systems, specifically the gradual decline in prediction accuracy caused by the static nature of matrix factorization models when new ratings arrive. Although traditional Matrix Factorization (MF) techniques can provide high-quality rating predictions, the models they generate are static and cannot reflect changes in user interests in a timely manner. Whenever new rating data is generated, if the model is not recalculated, the accuracy of the predictions gradually decreases over time. However, due to the high computational cost of matrix factorization, it is not practical to recalculate the model frequently.

### Problems and Solutions Proposed in the Paper

#### Problem Description

1. **Limitations of Static Models**: Once the model of the recommendation system is generated, it is not updated unless a new matrix factorization is carried out. This leads to a gradual decline in prediction accuracy.
2. **Integration of New Ratings**: New ratings cannot be incorporated into the model in a timely manner, which affects the timeliness and accuracy of recommendations.
3. **High Computational Cost**: The cost of frequently recalculating the matrix factorization model is too high, making it difficult to achieve dynamic updates.

#### Solution

To improve the prediction accuracy of the recommendation system and reduce the frequency of recalculating the model, the paper proposes a Cluster-based Matrix Factorization (CBMF) technique, which integrates new rating data online by introducing local biases. The specific methods are as follows:

1. **Combining Clustering and Matrix Factorization**:
   - Group similar items to form multiple clusters.
   - Assign a local bias to each user in each cluster to capture the user's rating tendency for a specific set of items.
2. **Dynamically Adjusting Local Biases**:
   - When there is new rating data, only the relevant local biases need to be recalculated, rather than the entire model.
   - This can significantly improve the prediction accuracy at a lower computational cost than a full refactorization.
3. **Formula Representation**:
   - The formula for calculating the local bias \( b_C^u \) is:

     \[ b_C^u = \frac{1}{|C|} \sum_{j \in C} (r_{uj} - \mu_C), \quad \forall j \in C \text{ s.t. } r_{uj} > 0 \]

     where \( C \) is the item cluster, \( r_{uj} \) is the rating of user \( u \) for item \( j \), and \( \mu_C \) is the average rating of all items in cluster \( C \).
4. **Prediction Formula**:
   - The formula for the predicted rating \( \hat{r}_{ui} \) is:

     \[ \hat{r}_{ui} = p_u \cdot q_i^T + \mu_{c(i)} + \delta_{c(i)}^u + b^u + b^i \]

     where \( c(i) \) represents the cluster to which item \( i \) belongs, \( b^i \) and \( b^u \) are the global biases of item \( i \) and user \( u \) respectively, and \( \delta_{c(i)}^u \) is the weighted bias difference of user \( u \) in cluster \( c(i) \).

Through this method, the CBMF model proposed in the paper can maintain high prediction accuracy without frequently recalculating the matrix factorization, and can effectively handle dynamically changing user rating data.
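
The local bias and prediction formulas can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `(user, item)` dictionary layout, and the toy ratings are all hypothetical. The summary does not say how the weighted bias difference \( \delta_{c(i)}^u \) is derived from the local bias, so the sketch takes it as an input. Holding \( \mu_C \) fixed when a new rating arrives is also an assumption made here for simplicity.

```python
def cluster_mean(ratings, cluster):
    """mu_C: mean of all known ratings on items belonging to `cluster`."""
    values = [r for (_, item), r in ratings.items() if item in cluster]
    return sum(values) / len(values) if values else 0.0


def local_bias(ratings, user, cluster, mu_c):
    """b_C^u as stated above: deviations from mu_C over the items of C that
    `user` has rated (r_uj > 0), normalised by |C|."""
    total = sum(
        ratings[(user, item)] - mu_c
        for item in cluster
        if ratings.get((user, item), 0) > 0
    )
    return total / len(cluster)


def predict(p_u, q_i, mu_c, delta_uc, b_u, b_i):
    """r_hat_ui = p_u . q_i + mu_c(i) + delta^u_c(i) + b^u + b^i.
    delta_uc is supplied by the caller (its derivation is not given here)."""
    dot = sum(a * b for a, b in zip(p_u, q_i))
    return dot + mu_c + delta_uc + b_u + b_i


# Hypothetical toy data: ratings keyed by (user, item), one item cluster.
ratings = {("u1", "a"): 5, ("u1", "b"): 3, ("u2", "a"): 4, ("u2", "c"): 2}
cluster = {"a", "b", "c"}
mu_c = cluster_mean(ratings, cluster)

bias_before = local_bias(ratings, "u1", cluster, mu_c)

# A new rating arrives: only u1's local bias for this cluster is recomputed.
# The latent factors p_u and q_i are left untouched, and mu_c is kept fixed
# (an assumption of this sketch, not something the summary specifies).
ratings[("u1", "c")] = 5
bias_after = local_bias(ratings, "u1", cluster, mu_c)
```

The point of the sketch is the cost asymmetry: integrating the new rating touches one user and one cluster, while `p_u` and `q_i` stay as produced by the last factorization.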