Enhanced Probabilistic Inference Algorithm Using Probabilistic Neural Networks For Learning Control

Yang Li, Shijie Guo, Lishuang Zhu, Toshiharu Mukai, Zhongxue Gan
DOI: https://doi.org/10.1109/ACCESS.2019.2959876
IF: 3.9
2019-01-01
IEEE Access
Abstract: Among model-based reinforcement learning (RL) methods, the probabilistic inference for learning control (PILCO) algorithm, which relies on Gaussian processes (GPs) to build probabilistic dynamics models, has attracted attention because it requires only a small amount of data and can learn from scratch in a few trials by explicitly incorporating model uncertainty into long-term planning. However, the time complexity of the GP, which is cubic with respect to the number of trials, makes it challenging to scale the framework to high-dimensional observation spaces. Moreover, the cost function of a task is restricted to a locally quadratic form so that the policy gradient can be calculated analytically. To solve these problems, we propose a probabilistic neural network (PNN) that replaces the GP in building probabilistic dynamics models, and we develop a deterministic control policy by using long-term predictions. In particular, we determine model uncertainty through basic prior knowledge and calculate the cumulative cost by sampling from state distributions. This approach helps reduce both the influence of model error and time consumption. Compared with state-of-the-art model-based RL methods, the proposed approach can reconcile data efficiency and speed of learning even in high-dimensional observation spaces.
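The abstract's idea of calculating cumulative cost by sampling from state distributions can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a one-dimensional state, assumes the predicted state at each time step is Gaussian with a given mean and standard deviation, and uses hypothetical names (`expected_cumulative_cost`, `abs_cost`, `state_dists`) and made-up numbers. It only shows why sampling removes the need for a locally quadratic cost: any cost function that can be evaluated pointwise will do.

```python
import random


def expected_cumulative_cost(state_dists, cost_fn, n_samples=1000, seed=0):
    """Monte Carlo estimate of sum_t E[cost_fn(x_t)], with x_t ~ N(mean_t, std_t^2).

    state_dists: list of (mean, std) pairs, one per time step (assumed Gaussian).
    cost_fn: any pointwise cost; it does not need to be quadratic.
    """
    rng = random.Random(seed)
    total = 0.0
    for mean, std in state_dists:
        # Draw samples from the predicted state distribution at this step.
        samples = [rng.gauss(mean, std) for _ in range(n_samples)]
        # Average the cost over the samples to approximate the expectation.
        total += sum(cost_fn(x) for x in samples) / n_samples
    return total


def abs_cost(x, target=0.0):
    """Illustrative non-quadratic cost: absolute distance to a target state."""
    return abs(x - target)


# Hypothetical predicted state distributions over three time steps.
state_dists = [(1.0, 0.1), (0.5, 0.2), (0.1, 0.3)]
estimate = expected_cumulative_cost(state_dists, abs_cost)
print(f"estimated cumulative cost: {estimate:.3f}")
```

The estimate becomes less noisy as `n_samples` grows, at a cost that scales linearly with the number of samples and time steps.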