Resource-rational reinforcement learning and sensorimotor causal states, and resource-rational maximiners

Sarah Marzen
2024-06-28
Abstract: We propose a new computational-level objective function for theoretical biology and theoretical neuroscience that combines: reinforcement learning, the study of learning with feedback via rewards; rate-distortion theory, a branch of information theory that deals with compressing signals to retain relevant information; and computational mechanics, the study of minimal sufficient statistics of prediction, also known as causal states. We highlight why this proposal is likely only an approximation, though likely an interesting one, and propose a new algorithm for evaluating it to obtain the newly coined "reward-rate manifold". The performance of real and artificial agents in partially observable environments can be newly benchmarked using these reward-rate manifolds. Finally, we describe experiments that can probe whether or not biological organisms are resource-rational reinforcement learners, using maximin strategies as an example: bacteria have been shown to be approximate maximiners, doing their best in the worst-case environment regardless of what is actually happening.
Neurons and Cognition