Autonomous reinforcement of behavioral sequences in neural dynamics

Sohrob Kazerounian, Matthew Luciw, Mathis Richter, Yulia Sandamirskaya
DOI: https://doi.org/10.1109/ijcnn.2013.6706877
2013-08-01
Abstract: We introduce a dynamic neural algorithm called Dynamic Neural (DN) SARSA($\lambda$) for learning a behavioral sequence from delayed reward. DN-SARSA($\lambda$) combines Dynamic Field Theory models of behavioral sequence representation, classical reinforcement learning, and a computational neuroscience model of working memory, called Item and Order working memory, which serves as an eligibility trace. DN-SARSA($\lambda$) is implemented on both a simulated and a real robot, each of which must learn a specific rewarding sequence of elementary behaviors through exploration. Results show that DN-SARSA($\lambda$) performs on par with discrete SARSA($\lambda$), validating the feasibility of general reinforcement learning without compromising neural dynamics.
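For readers unfamiliar with the discrete baseline named in the abstract, the following is a minimal sketch of tabular SARSA($\lambda$) with an eligibility trace, applied to a toy sequence-learning task with delayed reward. It is not the authors' DN-SARSA($\lambda$) and does not model neural dynamics. The task definition (a wrong behavior ends the episode, reward 1 only after the full sequence), the function names `sarsa_lambda` and `greedy_sequence`, and all parameter values are assumptions made for illustration.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one (ties broken at random)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def sarsa_lambda(target, n_actions, episodes=2000, alpha=0.5,
                 gamma=0.9, lam=0.8, epsilon=0.1, seed=0):
    """Tabular SARSA(lambda) on a hypothetical toy task.

    State s = number of correct behaviors executed so far.
    Choosing target[s] advances the state; any other behavior ends the
    episode. Reward is 1 only when the whole sequence is completed.
    """
    rng = random.Random(seed)
    n_states = len(target)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        # Eligibility trace, reset at the start of every episode.
        e = [[0.0] * n_actions for _ in range(n_states)]
        s = 0
        a = epsilon_greedy(q[s], epsilon, rng)
        while True:
            correct = (a == target[s])
            s_next = s + 1
            done = (not correct) or s_next == n_states
            r = 1.0 if (correct and s_next == n_states) else 0.0
            if done:
                delta = r - q[s][a]  # terminal: no bootstrap term
            else:
                a_next = epsilon_greedy(q[s_next], epsilon, rng)
                delta = r + gamma * q[s_next][a_next] - q[s][a]
            e[s][a] = 1.0  # replacing trace for the pair just visited
            for i in range(n_states):
                for j in range(n_actions):
                    q[i][j] += alpha * delta * e[i][j]
                    e[i][j] *= gamma * lam  # decay all traces
            if done:
                break
            s, a = s_next, a_next
    return q


def greedy_sequence(q):
    """Read out the highest-valued behavior for each step of the sequence."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = sarsa_lambda(target=[2, 0, 3], n_actions=4)
    print(greedy_sequence(q_table))
```

The trace is what lets the single delayed reward at the end of the sequence update every earlier state-action pair visited in that episode, each weighted by a power of `gamma * lam`. In the paper this role is played by the Item and Order working memory.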