Exploring the applicability of Large Language Models to citation context analysis

Kai Nishikawa, Hitoshi Koshiba

DOI: https://doi.org/10.1007/s11192-024-05142-9

2024-09-10

Abstract: Unlike traditional citation analysis -- which assumes that all citations in a paper are equivalent -- citation context analysis considers the contextual information of individual citations. However, citation context analysis requires creating large amounts of data through annotation, which hinders the widespread use of this methodology. This study explored the applicability of Large Language Models (LLMs) -- particularly ChatGPT -- to citation context analysis by comparing LLM and human annotation results. The results show that LLM annotation is as good as or better than human annotation in terms of consistency but poor in terms of predictive performance. Thus, having LLMs immediately replace human annotators in citation context analysis is inappropriate. However, the annotation results obtained by LLMs can be used as reference information when narrowing the annotation results obtained by multiple human annotators to one, or LLMs can be used as one of the annotators when it is difficult to prepare sufficient human annotators. This study provides basic findings important for the future development of citation context analyses.

Digital Libraries

What problem does this paper attempt to address?

The paper explores whether large language models (LLMs), particularly ChatGPT, can be applied to citation context analysis. The researchers evaluated the LLMs by comparing their annotations with those produced by human annotators. The paper addresses two main questions:

1. **Can LLMs replace humans in citation context analysis annotation?** The study found that LLMs are comparable to or better than humans in terms of consistency (how stable the annotation results are), but they perform poorly in terms of predictive performance. Replacing human annotators outright with LLMs is therefore not appropriate at present.
2. **How can LLMs be used effectively in citation context analysis?** Although their predictive performance is unsatisfactory, LLM annotations can serve as reference information when the results of multiple human annotators have to be narrowed down to one. In addition, when it is difficult to recruit enough human annotators, an LLM can act as one of the annotators.

The study examined these questions experimentally and reports basic findings relevant to the future development of citation context analysis. Overall, LLMs cannot yet replace human annotators, but they can serve as auxiliary tools that improve efficiency in certain scenarios.
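To make the two evaluation criteria and the suggested tie-breaking use more concrete, here is a minimal Python sketch. It rests on several assumptions that are not taken from the paper: consistency is approximated with Cohen's kappa between two label sequences, predictive performance with plain accuracy against a reference labelling, and the label names, the sample data, and the `resolve_with_llm` helper are hypothetical. The paper's actual metrics and annotation scheme may differ.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label sequences (assumed consistency measure)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Share of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


def accuracy(predicted, reference):
    """Fraction of predicted labels that match the reference labels."""
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


def resolve_with_llm(human_1, human_2, llm):
    """Hypothetical merge rule: keep labels the two humans agree on; when they
    disagree, accept the LLM label only if it sides with one of them;
    otherwise leave the item unresolved (None) for manual review."""
    merged = []
    for h1, h2, m in zip(human_1, human_2, llm):
        if h1 == h2:
            merged.append(h1)
        elif m in (h1, h2):
            merged.append(m)
        else:
            merged.append(None)
    return merged


# Hypothetical citation-purpose labels for four citations.
human_1 = ["background", "method", "background", "comparison"]
human_2 = ["background", "method", "method", "background"]
llm = ["background", "method", "method", "result"]

print(cohen_kappa(human_1, human_2))
print(accuracy(llm, human_2))
print(resolve_with_llm(human_1, human_2, llm))
```

In this sketch the LLM never overrides two humans who agree and never introduces a label that neither human chose. That is one conservative reading of using LLM output as "reference information", not a rule stated in the paper.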
