Inferring Human Vision in a Human-Like Way: Key Factors Influencing the Cognitive Processing of Level-1 Visual Perspective-Taking

Song Zhou, Huaqi Yang, Ming Ye, Ning Ding, Tao Liu
DOI: https://doi.org/10.1177/00936502241302569
IF: 6.3
2024-01-01
Communication Research
Abstract: The advancement of artificial intelligence (AI) has expanded the potential for human-machine communication and collaboration in complex contexts, requiring AI to exhibit human-like behavior in order to align with its human counterpart. Consequently, understanding human behavioral traits is advantageous for developing AI agents that resemble humans. This study investigated how individuals process visual information from others to inform the future design of intelligent vision systems. Across four experiments, participants judged whether a given number corresponded to the number of balls, while the gaze direction of an avatar was manipulated by averting its eyes or altering its head orientation. The results indicate that participant response times were influenced regardless of the avatar’s gaze direction. Specifically, when the avatar was positioned with its back facing the balls, any disparity in participant performance across different conditions was eliminated. These findings suggest that implicit level-1 visual perspective-taking may not primarily rely on gaze direction but rather on perceiving affordances within the environment. Such insights contribute to a deeper understanding of the cognitive mechanisms underlying level-1 visual perspective-taking and can serve as a theoretical foundation for advancing AI vision algorithms in human-machine communication and collaboration.