Neural fitted actor-critic.

Matthieu Zimmer,Yann Boniface,Alain Dutech
2016-01-01
Abstract:A novel reinforcement learning algorithm that deals with both continuous state and action spaces is proposed. Domain knowledge requirements are kept minimal by using non-linear estimators and since the algorithm does not need prior trajectories or known goal states. The new actor-critic algorithm is on-policy, offline and model-free. It considers discrete time, stationary policies, and maximizes the discounted sum of rewards. Experimental results on two common environments, showing the good performance of the proposed algorithm, are presented.
What problem does this paper attempt to address?