Improving the Performance of Learned Controllers in Behavior Trees Using Value Function Estimates at Switching Boundaries

Mart Kartašev, Petter Ögren
DOI: https://doi.org/10.1109/lra.2024.3382477
IF: 5.2
2024-01-01
IEEE Robotics and Automation Letters
Abstract: Behavior trees offer a modular approach to developing an overall controller from a set of sub-controllers that solve different sub-problems. These sub-controllers can be created using various methods, such as classical model-based control or reinforcement learning (RL). To achieve the overall goal, each sub-controller must satisfy the preconditions of the next sub-controller. Although every sub-controller may be locally optimal in achieving the preconditions of the next one, given some performance metric such as completion time, the overall controller may still not be optimal with respect to the same performance metric. In this paper, we demonstrate how the performance of the overall controller can be improved if we use approximations of value functions to inform the design of each sub-controller about the needs of the next one. We also show how, under certain assumptions, this leads to a globally optimal controller when the process is applied to all sub-controllers. Finally, this result also holds when some of the sub-controllers are already given. This means that if we are constrained to use some existing sub-controllers, the overall controller will be globally optimal, given this constraint.
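The gap between local and global optimality described in the abstract can be illustrated with a minimal toy sketch. It is not the authors' implementation. All names and numbers below (`time_to_boundary`, `next_value`, the two door states) are hypothetical, and the value of the next sub-controller is assumed to be the negative of its remaining completion time.

```python
# Toy illustration with made-up numbers, not taken from the paper.
# Steps sub-controller A needs to reach each state on the switching boundary.
time_to_boundary = {"left_door": 3, "right_door": 5}

# Assumed value estimate of the next sub-controller B at each boundary state,
# defined here as the negative of B's remaining completion time.
next_value = {"left_door": -10, "right_door": -4}


def local_choice(times):
    """Boundary state that A reaches fastest, ignoring what happens after."""
    return min(times, key=times.get)


def value_informed_choice(times, values):
    """Boundary state maximizing A's own return plus B's value estimate."""
    return max(times, key=lambda s: -times[s] + values[s])


def total_time(state, times, values):
    """Overall completion time when A hands over to B at `state`."""
    return times[state] - values[state]


local = local_choice(time_to_boundary)
informed = value_informed_choice(time_to_boundary, next_value)
print(local, total_time(local, time_to_boundary, next_value))
print(informed, total_time(informed, time_to_boundary, next_value))
```

In this sketch the locally fastest hand-over state leads to a longer overall completion time than the state chosen with the next sub-controller's value estimate taken into account.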
robotics
What problem does this paper attempt to address?