Exploiting Generalization in the Subspaces for Faster Model-Based Reinforcement Learning

Maryam Hashemzadeh,Reshad Hosseini,Majid Nili Ahmadabadi
DOI: https://doi.org/10.1109/TNNLS.2018.2869978
Abstract:Due to the lack of enough generalization in the state space, common methods of reinforcement learning suffer from slow learning speed, especially in the early learning trials. This paper introduces a model-based method in discrete state spaces for increasing the learning speed in terms of required experiences (but not required computation time) by exploiting generalization in the experiences of the subspaces. A subspace is formed by choosing a subset of features in the original state representation. Generalization and faster learning in a subspace are due to many-to-one mapping of experiences from the state space to each state in the subspace. Nevertheless, due to inherent perceptual aliasing (PA) in the subspaces, the policy suggested by each subspace does not generally converge to the optimal policy. Our approach, called model-based learning with subspaces (MoBLeSs), calculates the confidence intervals of the estimated Q -values in the state space and in the subspaces. These confidence intervals are used in the decision-making, such that the agent benefits the most from the possible generalization while avoiding from the detriment of the PA in the subspaces. The convergence of MoBLeS to the optimal policy is theoretically investigated. In addition, we show through several experiments that MoBLeS improves the learning speed in the early trials.
What problem does this paper attempt to address?