RelevAI-Reviewer: A Benchmark on AI Reviewers for Survey Paper Relevance

Paulo Henrique Couto, Quang Phuoc Ho, Nageeta Kumari, Benedictus Kent Rachmat, Thanh Gia Hieu Khuong, Ihsan Ullah, Lisheng Sun-Hosoya
2024-06-13
Abstract: Recent advancements in Artificial Intelligence (AI), particularly the widespread adoption of Large Language Models (LLMs), have significantly enhanced text analysis capabilities. This technological evolution offers considerable promise for automating the review of scientific papers, a task traditionally managed through peer review by fellow researchers. Despite its critical role in maintaining research quality, the conventional peer-review process is often slow and subject to biases, potentially impeding the swift propagation of scientific knowledge. In this paper, we propose RelevAI-Reviewer, an automatic system that conceptualizes the task of survey paper review as a classification problem, aimed at assessing the relevance of a paper in relation to a specified prompt, analogous to a "call for papers". To address this, we introduce a novel dataset comprised of 25,164 instances. Each instance contains one prompt and four candidate papers, each varying in relevance to the prompt. The objective is to develop a machine learning (ML) model capable of determining the relevance of each paper and identifying the most pertinent one. We explore various baseline approaches, including traditional ML classifiers like Support Vector Machine (SVM) and advanced language models such as BERT. Preliminary findings indicate that the BERT-based end-to-end classifier surpasses other conventional ML methods in performance. We present this problem as a public challenge to foster engagement and interest in this area of research.
Computation and Language, Machine Learning
What problem does this paper attempt to address?
The paper addresses two weaknesses of traditional peer review of scientific papers: it is slow, and it is subject to bias. The authors propose an automated system, RelevAI-Reviewer, that frames the review of survey papers as a classification problem: given a prompt (analogous to a "call for papers"), assess how relevant each candidate paper is to it and identify the most relevant one. To support this task, the researchers built a new dataset of 25,164 instances, each consisting of one prompt and four candidate papers of varying relevance to that prompt. They evaluated several baselines, including traditional machine learning classifiers such as Support Vector Machines (SVM) and language models such as BERT. In their preliminary results, the BERT-based end-to-end classifier outperformed the conventional machine learning methods. The paper also presents the problem as a public challenge, with the aim of drawing more researchers into work on automating the review of scientific papers.
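
To make the task format concrete, here is a minimal Python sketch of one instance (one prompt, four candidate papers) and of selecting the most relevant candidate. It is illustrative only: the `Instance` class, its field names, and the example texts are hypothetical, and the word-overlap scorer is a toy stand-in, not the SVM or BERT baselines evaluated in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    """Hypothetical layout of one instance: a prompt plus four candidate papers."""
    prompt: str
    candidates: List[str]  # texts of the four candidate papers


def jaccard_relevance(prompt: str, paper: str) -> float:
    """Toy stand-in scorer (an assumption, not the paper's method):
    Jaccard overlap between the word sets of the prompt and the paper."""
    a = set(prompt.lower().split())
    b = set(paper.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_relevant(
    instance: Instance,
    score: Callable[[str, str], float] = jaccard_relevance,
) -> int:
    """Return the index of the candidate with the highest relevance score."""
    scores = [score(instance.prompt, paper) for paper in instance.candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up example data, for illustration only.
example = Instance(
    prompt="survey of deep learning for medical image segmentation",
    candidates=[
        "a survey of deep learning methods for medical image segmentation",
        "deep learning for natural language processing",
        "a history of impressionist painting",
        "a survey of graph algorithms",
    ],
)
best_index = most_relevant(example)
```

Any function that maps a (prompt, paper) pair to a score, such as a trained classifier's predicted relevance, could be passed as `score` in place of the toy scorer.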