CATA++: A Collaborative Dual Attentive Autoencoder Method for Recommending Scientific Articles

Meshal Alfarhood, Jianlin Cheng
DOI: https://doi.org/10.1109/ACCESS.2020.3029722
2020-05-15
Abstract: Recommender systems today have become an essential component of any commercial website. Collaborative filtering approaches, and Matrix Factorization (MF) techniques in particular, are widely used in recommender systems. However, the natural data sparsity problem limits their performance where users generally interact with very few items in the system. Consequently, multiple hybrid models were proposed recently to optimize MF performance by incorporating additional contextual information in its learning process. Although these models improve the recommendation quality, there are two primary aspects for further improvements: (1) multiple models focus only on some portion of the available contextual information and neglect other portions; (2) learning the feature space of the side contextual information needs to be further enhanced. In this paper, we introduce a Collaborative Dual Attentive Autoencoder (CATA++) for recommending scientific articles. CATA++ utilizes an article's content and learns its latent space via two parallel autoencoders. We employ the attention mechanism to capture the most related parts of information in order to make more relevant recommendations. Extensive experiments on three real-world datasets have shown that our dual-way learning strategy has significantly improved the MF performance in comparison with other state-of-the-art MF-based models using various experimental evaluations. The source code of our methods is available at: https://github.com/jianlin-cheng/CATA
Machine Learning, Information Retrieval
What problem does this paper attempt to address?
This paper addresses two problems that existing recommender systems face when recommending scientific articles: **data sparsity** and **insufficient use of contextual information**. Specifically:

1. **Data sparsity**: Collaborative Filtering (CF) models, and Matrix Factorization (MF) techniques in particular, lose considerable accuracy when user-item interaction data is scarce. Most users interact with only a small number of items in the system, so the interaction matrix is very sparse.
2. **Under-utilization of contextual information**: Existing hybrid models improve MF performance by introducing additional contextual information (such as article content and tags), but they often use only part of what is available and ignore the rest. In addition, the way the feature space of this side information is learned still needs to be improved.

To solve these problems, the authors propose a new method, the **Collaborative Dual Attentive Autoencoder (CATA++)**. Its main contributions are as follows:

- **Attention mechanism**: By integrating the attention mechanism into the deep feature learning process, CATA++ learns more effective feature representations from an article's information (titles, abstracts, tags, and citation relationships). This improves recommendation quality, especially on highly sparse datasets.
- **Full use of article content**: To the authors' knowledge, CATA++ is the first model to use all of an article's content simultaneously (titles, abstracts, tags, and citation relationships), which it achieves by coupling two attentive autoencoder networks. The latent features learned by these networks are then integrated into the matrix factorization method to generate the final recommendations.
- **Experimental verification**: The authors conducted extensive experiments on three real-world datasets. The results show that CATA++ outperforms other state-of-the-art MF-based models on multiple evaluation metrics, especially when the data is extremely sparse.

In conclusion, the paper proposes the CATA++ model to address data sparsity and the insufficient use of contextual information in existing scientific article recommender systems, with the aim of producing higher-quality recommendations.
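The ideas summarized above can be illustrated with a minimal, untrained sketch in plain Python. It is not the authors' implementation (that is in the repository linked in the abstract). Several details are assumptions made only for illustration: the softmax gating used as the "attention" step, the split of inputs between the two encoders (title and abstract in one, tags and citations in the other), the summation of the two encodings into a single item vector, and every function and variable name. Weights are random, so only the data flow is meaningful, not the score itself.

```python
import math
import random


def sparsity(interactions):
    """Fraction of empty cells in a user-item interaction matrix."""
    cells = sum(len(row) for row in interactions)
    observed = sum(1 for row in interactions for v in row if v != 0)
    return 1.0 - observed / cells


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Dense layer: weights has one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def random_layer(n_out, n_in, rng):
    """Hypothetical helper: random weights, zero bias (no training here)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def attentive_encode(x, hidden_layer, attention_layer):
    """Encode x, then reweight the hidden units by softmax attention scores.

    Assumption: element-wise gating of the hidden code; the summary does
    not specify the exact layer layout used in CATA++.
    """
    w_h, b_h = hidden_layer
    w_a, b_a = attention_layer
    hidden = [math.tanh(v) for v in linear(w_h, b_h, x)]
    scores = softmax(linear(w_a, b_a, hidden))
    return [h * a for h, a in zip(hidden, scores)]


def predict(user_vec, text_code, side_code):
    """MF-style score: dot product of the user vector with the item vector.

    Assumption: the item vector is the sum of the two encoder outputs.
    """
    item_vec = [t + s for t, s in zip(text_code, side_code)]
    return sum(u * v for u, v in zip(user_vec, item_vec))


rng = random.Random(0)
k = 4  # latent dimension (toy value)

text_features = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0]  # toy bag-of-words (title + abstract)
side_features = [0.0, 1.0, 1.0]                 # toy tag / citation indicators

text_code = attentive_encode(text_features,
                             random_layer(k, 6, rng), random_layer(k, k, rng))
side_code = attentive_encode(side_features,
                             random_layer(k, 3, rng), random_layer(k, k, rng))
user_vec = [rng.uniform(-1.0, 1.0) for _ in range(k)]

score = predict(user_vec, text_code, side_code)
print("sparsity of a 2x3 toy matrix:", sparsity([[1, 0, 0], [0, 0, 1]]))
print("predicted preference score:", score)
```

In a real model the encoder weights would be learned by reconstructing each input, and the user vectors would be fitted to the observed interactions. The sketch only shows how two content encodings might feed a single matrix-factorization score.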