Measuring Agreeableness Bias in Multimodal Models

Jaehyuk Lim,Bruce W. Lee
2024-10-15
Abstract:This paper examines a phenomenon in multimodal language models where pre-marked options in question images can significantly influence model responses. Our study employs a systematic methodology to investigate this effect: we present models with images of multiple-choice questions, which they initially answer correctly, then expose the same model to versions with pre-marked options. Our findings reveal a significant shift in the models' responses towards the pre-marked option, even when it contradicts their answers in the neutral settings. Comprehensive evaluations demonstrate that this agreeableness bias is a consistent and quantifiable behavior across various model architectures. These results show potential limitations in the reliability of these models when processing images with pre-marked options, raising important questions about their application in critical decision-making contexts where such visual cues might be present.
Artificial Intelligence,Computation and Language,Computer Vision and Pattern Recognition,Human-Computer Interaction
What problem does this paper attempt to address?