Reward is not enough: can we liberate AI from the reinforcement learning paradigm?

Vacslav Glukhov
2024-11-11
Abstract: I present arguments against the hypothesis put forward by Silver, Singh, Precup, and Sutton (https://www.sciencedirect.com/science/article/pii/S0004370221000862). I argue that reward maximisation is not enough to explain many activities associated with natural and artificial intelligence, including knowledge, learning, perception, social intelligence, evolution, language, generalisation and imitation. I show that such reductio ad lucrum has its intellectual origins in the political economy of Homo economicus and substantially overlaps with the radical version of behaviourism. I show why the reinforcement learning paradigm, despite its demonstrable usefulness in some practical applications, is an incomplete framework for intelligence, both natural and artificial. Complexities of intelligent behaviour are not simply second-order complications on top of reward maximisation. This fact has profound implications for the development of practically usable, smart, safe and robust artificially intelligent agents.
Artificial Intelligence