Investigating Context Effects in Similarity Judgements in Large Language Models

Sagar Uprety, Amit Kumar Jaiswal, Haiming Liu, Dawei Song
2024-08-20
Abstract: Large Language Models (LLMs) have revolutionised the capability of AI models in comprehending and generating natural language text. They are increasingly being used to empower and deploy agents in real-world scenarios, which make decisions and take actions based on their understanding of the context. Therefore, researchers, policy makers and enterprises alike are working towards ensuring that the decisions made by these agents align with human values and user expectations. That being said, human values and decisions are not always straightforward to measure and are subject to different cognitive biases. There is a vast body of literature in Behavioural Science which studies biases in human judgements. In this work we report an ongoing investigation into the alignment of LLMs with human judgements affected by order bias. Specifically, we focus on a famous human study which showed evidence of order effects in similarity judgements, and replicate it with various popular LLMs. We report the different settings where LLMs exhibit human-like order effect bias and discuss the implications of these findings to inform the design and development of LLM-based applications.
Artificial Intelligence
What problem does this paper attempt to address?
The problem this paper attempts to address is whether large language models (LLMs) exhibit order effect biases similar to humans in similarity judgments. Specifically, the authors replicate a famous study by Tversky and Gati, which demonstrated the presence of order effects in human similarity judgments, meaning that the order in which objects are presented affects the judgment of their similarity. The authors repeat this experiment using various popular LLMs to explore whether these models also exhibit similar human-like order effect biases under different settings and discuss the implications of these findings for the design and development of LLM-based applications.

The main research questions of the paper are:
- To what extent do large language models align with human judgments in context-sensitive similarity assessments?
- Under what conditions do LLMs exhibit order effect biases similar to those of humans?
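To make the idea of an order effect concrete, here is a minimal sketch of how one might quantify it: ask for a similarity rating with the two items in one order, ask again with the order swapped, and average the differences. Everything in it is an assumption made for illustration and is not taken from the paper: the prompt wording in `build_prompt`, the `order_effect` helper, the stand-in `toy_rater` (which replaces a real LLM call), and the placeholder ratings.

```python
from statistics import mean
from typing import Callable, Sequence, Tuple


def build_prompt(first: str, second: str) -> str:
    # Hypothetical prompt wording; the paper's actual prompts are not given here.
    return f"On a scale of 0 to 20, how similar is {first} to {second}?"


def order_effect(
    rate: Callable[[str, str], float],
    pairs: Sequence[Tuple[str, str]],
) -> float:
    """Mean of rate(a, b) - rate(b, a) over all pairs.

    A value near zero suggests no systematic order effect;
    a clearly non-zero value suggests the presentation order matters.
    """
    return mean(rate(a, b) - rate(b, a) for a, b in pairs)


# Placeholder ratings for illustration only (NOT results from the paper).
_TOY_RATINGS = {
    ("A", "B"): 12.0,
    ("B", "A"): 10.0,
    ("C", "D"): 8.0,
    ("D", "C"): 8.0,
}


def toy_rater(first: str, second: str) -> float:
    # Stand-in for sending build_prompt(first, second) to an LLM
    # and parsing a numeric rating from its reply.
    return _TOY_RATINGS[(first, second)]


if __name__ == "__main__":
    print(build_prompt("A", "B"))
    print(order_effect(toy_rater, [("A", "B"), ("C", "D")]))
```

In practice `toy_rater` would be replaced by a function that queries a model, and each ordering would typically be sampled several times so that random variation in the model's output is not mistaken for an order effect.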