Violation of Expectation via Metacognitive Prompting Reduces Theory of Mind Prediction Error in Large Language Models

Courtland Leer, Vincent Trost, Vineeth Voruganti

DOI: https://doi.org/10.48550/arXiv.2310.06983

2023-10-11

Abstract: Recent research shows that Large Language Models (LLMs) exhibit a compelling level of proficiency in Theory of Mind (ToM) tasks. This ability to impute unobservable mental states to others is vital to human social cognition and may prove equally important in principal-agent relations between individual humans and Artificial Intelligences (AIs). In this paper, we explore how a mechanism studied in developmental psychology known as Violation of Expectation (VoE) can be implemented to reduce errors in LLM prediction about users by leveraging emergent ToM affordances. And we introduce a metacognitive prompting framework to apply VoE in the context of an AI tutor. By storing and retrieving facts derived in cases where LLM expectation about the user was violated, we find that LLMs are able to learn about users in ways that echo theories of human learning. Finally, we discuss latent hazards and augmentative opportunities associated with modeling user psychology and propose ways to mitigate risk along with possible directions for future inquiry.

Computation and Language, Machine Learning

What problem does this paper attempt to address?

This paper addresses prediction error in large language models (LLMs) on Theory of Mind (ToM) tasks. Specifically, the authors use metacognitive prompting to implement Violation of Expectation (VoE), a mechanism studied in developmental psychology, in order to reduce LLM errors when predicting user behavior. By storing and retrieving facts derived in cases where the LLM's expectation about the user was violated, the study finds that LLMs can learn about users in ways that echo theories of human learning. The paper also discusses the latent hazards and augmentative opportunities of modeling user psychology, and proposes ways to mitigate risk along with directions for future research. The main objectives of the paper are: 1. To demonstrate the general utility of the metacognitive prompting framework in reducing ToM prediction errors in a specific application, Bloom (a free AI tutor). 2. To discuss in depth the opportunities for future work, including the practical and philosophical significance of this emergent ability, and how confidential computing environments could be used to secure these mental renderings.
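The store-and-retrieve loop described above can be illustrated with a small sketch. This is not the authors' implementation or Bloom's code: the names `VoEMemory`, `voe_step`, and the three callables (`predict`, `is_violation`, `derive_fact`) are hypothetical, the word-overlap retrieval is a toy stand-in for whatever store a real system would use, and the callables stand in for LLM prompts.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class VoEMemory:
    """Hypothetical in-memory store for facts learned from violated expectations."""
    facts: List[str] = field(default_factory=list)

    def store(self, fact: str) -> None:
        # Skip empty and duplicate facts.
        if fact and fact not in self.facts:
            self.facts.append(fact)

    def retrieve(self, query: str, k: int = 3) -> List[str]:
        # Toy retrieval (assumption): rank facts by word overlap with the query.
        q = set(query.lower().split())
        scored = [(len(q & set(f.lower().split())), f) for f in self.facts]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        return [f for _, f in ranked[:k]]


def voe_step(
    memory: VoEMemory,
    history: List[str],
    actual_reply: str,
    predict: Callable[[List[str], List[str]], str],
    is_violation: Callable[[str, str], bool],
    derive_fact: Callable[[str, str], str],
) -> bool:
    """Run one predict/compare/learn cycle; return True if the expectation was violated."""
    context = memory.retrieve(" ".join(history))   # facts from earlier violations
    expected = predict(history, context)           # expectation about the user's reply
    violated = is_violation(expected, actual_reply)
    if violated:
        # Only violated expectations produce new stored facts.
        memory.store(derive_fact(expected, actual_reply))
    return violated


# Stand-ins for LLM calls (assumptions for illustration only).
def stub_predict(history: List[str], facts: List[str]) -> str:
    if any("prefers examples" in f for f in facts):
        return "asks for an example"
    return "asks for a summary"


def stub_is_violation(expected: str, actual: str) -> bool:
    return expected != actual


def stub_derive_fact(expected: str, actual: str) -> str:
    return f"for essay help the user prefers examples, not what was expected ({expected})"
```

In this toy setup, calling `voe_step` twice with the history `["help me with my essay"]` and the reply `"asks for an example"` reports a violation on the first call and none on the second, because the fact stored after the first call changes the next prediction.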

PHAnToM: Persona-based Prompting Has An Effect on Theory-of-Mind Reasoning in Large Language Models

Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities

Theory of Mind abilities of Large Language Models in Human-Robot Interaction : An Illusion?

Boosting Theory-of-Mind Performance in Large Language Models via Prompting

SimpleToM: Exposing the Gap between Explicit ToM Inference and Implicit ToM Application in LLMs

Theory of Mind in Large Language Models: Examining Performance of 11 State-of-the-Art models vs. Children Aged 7-10 on Advanced Tests

Metacognitive Prompting Improves Understanding in Large Language Models

What You Need is What You Get: Theory of Mind for an LLM-Based Code Understanding Assistant

Perceptions to Beliefs: Exploring Precursory Inferences for Theory of Mind in Large Language Models

Larger Language Models Don't Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks

Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting

How FaR Are Large Language Models From Agents with Theory-of-Mind?

Constrained Reasoning Chains for Enhancing Theory-of-Mind in Large Language Models

Probing the Robustness of Theory of Mind in Large Language Models

Minding Language Models' (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker

Testing theory of mind in large language models and humans

Do Large Language Models Exhibit Cognitive Dissonance? Studying the Difference Between Revealed Beliefs and Stated Answers

Context-faithful Prompting for Large Language Models

Investigating the Role of Prompting and External Tools in Hallucination Rates of Large Language Models