Experience-driven discovery of planning strategies

Ruiqi He, Falk Lieder
2024-12-04
Abstract: One explanation for how people can plan efficiently despite limited cognitive resources is that we possess a set of adaptive planning strategies and know when and how to use them. But how are these strategies acquired? While previous research has studied how individuals learn to choose among existing strategies, little is known about the process of forming new planning strategies. In this work, we propose that new planning strategies are discovered through metacognitive reinforcement learning. To test this, we designed a novel experiment to investigate the discovery of new planning strategies. We then present metacognitive reinforcement learning models and demonstrate their capability for strategy discovery as well as show that they provide a better explanation of human strategy discovery than alternative learning mechanisms. However, when fitted to human data, these models exhibit a slower discovery rate than humans, leaving room for improvement.
Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how people discover new planning strategies. Although human cognitive resources are limited, people often plan efficiently in daily life, from simple dinner arrangements to complex career plans. This ability is partly attributed to adaptive strategies that guide the planning process, but how such strategies are discovered and formed remains unclear; in particular, little research has examined how people form new planning strategies. The paper proposes that new strategies are discovered in an experience-driven manner through Metacognitive Reinforcement Learning (MCRL).

To study this, the authors designed a novel experimental task in which participants had to learn a planning strategy for a specific task environment that they were unlikely to have known beforehand. Using the participants' click behavior and other data, the authors evaluated how well MCRL models explain human strategy discovery and compared them with alternative learning mechanisms. The MCRL models can discover new strategies and explain human strategy discovery better than the alternatives, but when fitted to human data they discover strategies more slowly than humans do, which leaves room for improvement. The study also found individual differences in learning mechanisms: some people may rely more on habitual behavior patterns than on value-oriented learning. These findings bear on understanding human metacognitive learning and on developing more advanced artificial intelligence systems.
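The core idea, that the value of a planning operation can itself be learned from experienced payoffs, can be illustrated with a toy sketch. This is not the authors' model or task: the two-path environment, the payoffs, the click cost, and the names `run_trial` and `learn` are all assumptions made for illustration. The learner keeps a running value estimate for two meta-level actions, "plan" (pay a cost to reveal the payoffs before choosing) and "act" (choose blindly), and updates the estimate of whichever it used.

```python
import random

CLICK_COST = 1.0            # assumed cost of one planning operation
REWARDS = [-10.0, 10.0]     # assumed payoffs hidden behind two paths


def run_trial(plan: bool, rng: random.Random) -> float:
    """Return the net payoff of one trial of the toy task."""
    payoffs = REWARDS[:]
    rng.shuffle(payoffs)            # which path is good changes every trial
    if plan:
        # Clicking reveals both payoffs, so the agent takes the better path.
        return max(payoffs) - CLICK_COST
    # Without planning the agent picks a path blindly.
    return payoffs[rng.randrange(2)]


def learn(n_trials: int = 500, alpha: float = 0.1, epsilon: float = 0.1,
          seed: int = 0) -> dict:
    """Learn the value of the two meta-level actions from experience."""
    rng = random.Random(seed)
    q = {"plan": 0.0, "act": 0.0}   # estimated value of each meta-level action
    for _ in range(n_trials):
        if rng.random() < epsilon:
            choice = rng.choice(["plan", "act"])    # occasional exploration
        else:
            choice = max(q, key=q.get)              # otherwise act greedily
        payoff = run_trial(choice == "plan", rng)
        # Move the estimate a fraction alpha toward the experienced payoff.
        q[choice] += alpha * (payoff - q[choice])
    return q


if __name__ == "__main__":
    print(learn())
```

Under these assumed numbers, planning always yields 9.0 (the better payoff minus the click cost), whereas acting blindly averages 0 in expectation, so a learner of this kind should come to prefer planning purely from experience. The models in the paper operate over sequences of planning operations rather than a single plan-or-act choice, which this sketch does not attempt to capture.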