Abstract: Social intelligence and Theory of Mind (ToM), i.e., the ability to reason about the different mental states, intents, and reactions of all people involved, allow humans to effectively navigate and understand everyday social interactions. As NLP systems are used in increasingly complex social situations, their ability to grasp social dynamics becomes crucial. In this work, we examine the open question of social intelligence and Theory of Mind in modern NLP systems from an empirical and theory-based perspective. We show that one of today's largest language models (GPT-3; Brown et al., 2020) lacks this kind of social intelligence out-of-the-box, using two tasks: SocialIQa (Sap et al., 2019), which measures models' ability to understand intents and reactions of participants of social interactions, and ToMi (Le et al., 2019), which measures whether models can infer mental states and realities of participants of situations. Our results show that models struggle substantially at these Theory of Mind tasks, with well-below-human accuracies of 55% and 60% on SocialIQa and ToMi, respectively. To conclude, we draw on theories from pragmatics to contextualize this shortcoming of large language models, by examining the limitations stemming from their data, neural architecture, and training paradigms. Challenging the prevalent narrative that only scale is needed, we posit that person-centric NLP approaches might be more effective towards neural Theory of Mind. In our updated version, we also analyze newer instruction-tuned and RLHF models for neural ToM. We find that even ChatGPT and GPT-4 do not display emergent Theory of Mind; strikingly, even GPT-4 reaches only 60% accuracy on the ToMi questions related to mental states and realities.

Inverse Reinforcement Learning as the Algorithmic Basis for Theory of Mind: Current Methods and Open Problems

Theory of Mind as Intrinsic Motivation for Multi-Agent Reinforcement Learning

Towards Theoretical Understanding of Inverse Reinforcement Learning

On computational models of theory of mind and the imitative reinforcement learning in spiking neural networks

Theory of Mind and Preference Learning at the Interface of Cognitive Science, Neuroscience, and AI: A Review

Multiagent Inverse Reinforcement Learning via Theory of Mind Reasoning

Mind the gap: Challenges of deep learning approaches to Theory of Mind

Bayesian Inverse Reinforcement Learning for Non-Markovian Rewards

On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models

Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience

Interpretable Reinforcement Learning Inspired by Piaget's Theory of Cognitive Development

Reinforcement Learning and Inverse Reinforcement Learning with System 1 and System 2

A Brain-Inspired Model of Theory of Mind

Unveiling the latent dynamics in social cognition with multi-agent inverse reinforcement learning

An Information-Theoretic Perspective on Intrinsic Motivation in Reinforcement Learning: A Survey

Modeling Theory of Mind in Multi-Agent Games Using Adaptive Feedback Control

On the Effective Horizon of Inverse Reinforcement Learning

Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs

It Takes Two to Tango: Towards Theory of AI's Mind

Modeling Moral Choices in Social Dilemmas with Multi-Agent Reinforcement Learning

Experiments in Artificial Theory of Mind: From Safety to Story-Telling