Academic Article Recommendation Using Multiple Perspectives

Kenneth Church, Omar Alonso, Peter Vickers, Jiameng Sun, Abteen Ebrahimi, Raman Chandrasekar
2024-07-08
Abstract: We argue that Content-based filtering (CBF) and Graph-based methods (GB) complement one another in Academic Search recommendations. The scientific literature can be viewed as a conversation between authors and the audience. CBF uses abstracts to infer authors' positions, and GB uses citations to infer responses from the audience. In this paper, we describe nine differences between CBF and GB, as well as synergistic opportunities for hybrid combinations. Two embeddings will be used to illustrate these opportunities: (1) Specter, a CBF method based on BERT-like deepnet encodings of abstracts, and (2) ProNE, a GB method based on spectral clustering of more than 200M papers and 2B citations from Semantic Scholar.
Information Retrieval
What problem does this paper attempt to address?
The paper addresses how Content-Based Filtering (CBF) and Graph-Based (GB) methods complement each other, and how they can be combined, in academic search recommendation systems. The authors characterize the two approaches as follows:

1. **Content-Based Filtering (CBF)**: Uses abstracts to infer the authors' positions, so it captures what a paper says.
2. **Graph-Based (GB)**: Uses citations to infer the audience's responses, so it captures how the academic community has received a paper.

The main goals of the paper are:

- **Compare CBF and GB**: Describe nine differences between them: input, interpretation, history, implementation details, computational bottlenecks, scale, time invariance, prior knowledge, and edge cases and missing values.
- **Explore synergies between CBF and GB**: Combine the two methods to improve the coverage and robustness of the recommendation system.

The authors use two embeddings to illustrate these opportunities:

- **Specter**: A CBF method based on BERT-like deep network encodings of abstracts.
- **ProNE**: A GB method based on spectral clustering of more than 200 million papers and 2 billion citations from Semantic Scholar.

The stated aim is to improve academic search recommendation so that it better serves authors, reviewers, and funding agencies, with the broader goal of improving paper quality and guiding innovation in the field.
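To make the idea of a hybrid combination concrete, here is a minimal Python sketch. It is not the authors' method: the weighted average, the fallback rule, the function names (`hybrid_score`, `recommend`), and the two-dimensional toy vectors are all assumptions chosen for illustration. The sketch scores a pair of papers by cosine similarity in each embedding and, when one embedding has no vector for a paper (for example, a paper with no citations yet, or one with no abstract), falls back to the embedding that does cover it.

```python
from typing import Dict, List, Optional, Tuple

import numpy as np

Embeddings = Dict[str, np.ndarray]  # paper id -> vector


def cosine(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    denom = float(np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.dot(u, v)) / denom if denom > 0.0 else 0.0


def view_score(emb: Embeddings, a: str, b: str) -> Optional[float]:
    """Similarity in one embedding, or None if either paper is missing."""
    if a in emb and b in emb:
        return cosine(emb[a], emb[b])
    return None


def hybrid_score(cbf: Embeddings, gb: Embeddings, a: str, b: str,
                 cbf_weight: float = 0.5) -> Optional[float]:
    """Weighted average of the two views (an assumed combination rule).

    If only one view covers the pair, that view's score is used alone.
    """
    s_cbf = view_score(cbf, a, b)
    s_gb = view_score(gb, a, b)
    if s_cbf is None:
        return s_gb
    if s_gb is None:
        return s_cbf
    return cbf_weight * s_cbf + (1.0 - cbf_weight) * s_gb


def recommend(cbf: Embeddings, gb: Embeddings, query: str,
              k: int = 3) -> List[Tuple[str, float]]:
    """Rank every other paper covered by at least one embedding."""
    candidates = (set(cbf) | set(gb)) - {query}
    scored = [(c, hybrid_score(cbf, gb, query, c)) for c in candidates]
    kept = [(c, s) for c, s in scored if s is not None]
    # Sort by descending score, then by id so ties are deterministic.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))[:k]


# Hypothetical toy data: p3 has no citation vector, p4 has no abstract vector.
cbf_toy: Embeddings = {
    "p1": np.array([1.0, 0.0]),
    "p2": np.array([1.0, 0.0]),
    "p3": np.array([0.0, 1.0]),
}
gb_toy: Embeddings = {
    "p1": np.array([1.0, 0.0]),
    "p2": np.array([0.0, 1.0]),
    "p4": np.array([1.0, 0.0]),
}

ranking = recommend(cbf_toy, gb_toy, "p1")
```

In this toy setup, `p4` is reachable only through the citation view and `p3` only through the abstract view, so the union of the two embeddings covers more candidates than either does alone. A real system would replace the dictionaries with Specter and ProNE vectors and would likely tune or learn the combination rather than fix `cbf_weight` at 0.5.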