A Novel Ping-pong Task Strategy Based on Model-free Multi-dimensional Q-function Deep Reinforcement Learning

Hongxu Ma, Jianyin Fan, Qiang Wang
DOI: https://doi.org/10.1109/ICSAI57119.2022.10005466
2022-01-01
Abstract: Deep reinforcement learning has been widely used for decision-making in table tennis tasks, but most existing methods have drawbacks, such as relying on high-precision trajectory prediction or requiring targeted class training. None of them can derive a complete hitting strategy directly from the initial state of the ball. In this paper, we train a ping-pong hitting policy controller with model-free reinforcement learning. By extending the multi-dimensional Q-function, we integrate the prediction part of the table tennis task with the batting strategy part, so that a single network handles both trajectory prediction and batting. This simplifies the prediction process: there is no need to build a complex dynamics model or to train a separate neural network to predict the trajectory of the ball. In this way, the deep reinforcement learning process and the supervised trajectory prediction training process are merged into a single process. Experimental results show that training essentially converges within 50,000 rounds. In 10,000 test trials, the success rate exceeded 99%.
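The abstract does not specify the network or the exact form of the multi-dimensional Q-function, so the following is only a minimal sketch of one possible reading: a critic that maps the initial ball state and a candidate hitting action to a vector of Q-values, one per reward component, instead of a single scalar. The class `VectorQ`, the function `select_action`, the linear model, the state and action dimensions, and the idea of summing components to rank actions are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


class VectorQ:
    """Toy linear critic (hypothetical): Q(s, a) returns K values,
    one per assumed reward component, rather than a single scalar."""

    def __init__(self, state_dim, action_dim, n_components, seed=0):
        rng = np.random.default_rng(seed)
        # One weight row per Q-value component.
        self.W = rng.normal(scale=0.1, size=(n_components, state_dim + action_dim))

    def __call__(self, state, action):
        return self.W @ np.concatenate([state, action])

    def update(self, state, action, target, lr=0.05):
        # One gradient step on 0.5 * ||Q(s, a) - target||^2.
        x = np.concatenate([state, action])
        error = self(state, action) - target
        self.W -= lr * np.outer(error, x)


def select_action(q, state, candidates):
    # Assumption: rank candidate actions by the sum of the Q-value components.
    scores = [q(state, a).sum() for a in candidates]
    return candidates[int(np.argmax(scores))]


# Hypothetical dimensions: 6-D ball state (position and velocity),
# 3-D hitting action, 2 reward components.
q = VectorQ(state_dim=6, action_dim=3, n_components=2)
ball_state = np.full(6, 0.5)
candidates = [np.full(3, v) for v in (-0.5, 0.0, 0.5)]
best_action = select_action(q, ball_state, candidates)
```

In this sketch the action is chosen directly from the initial ball state, with no separate trajectory predictor, which mirrors the single-process idea described in the abstract. A real controller would replace the linear model with a deep network and a continuous-action method.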