Efficient Object Manipulation Planning with Monte Carlo Tree Search

Huaijiang Zhu, Avadesh Meduri, Ludovic Righetti
DOI: https://doi.org/10.48550/arXiv.2206.09023
2023-03-20
Abstract: This paper presents an efficient approach to object manipulation planning using Monte Carlo Tree Search (MCTS) to find contact sequences and an efficient ADMM-based trajectory optimization algorithm to evaluate the dynamic feasibility of candidate contact sequences. To accelerate MCTS, we propose a methodology to learn a goal-conditioned policy-value network to direct the search towards promising nodes. Further, manipulation-specific heuristics drastically reduce the search space. Systematic object manipulation experiments in a physics simulator and on real hardware demonstrate the efficiency of our approach. In particular, our approach scales favorably for long manipulation sequences thanks to the learned policy-value network, significantly improving planning success rate.
Robotics
What problem does this paper attempt to address?
This paper addresses the problem of efficiently planning contact sequences for robotic object manipulation. Specifically, given a desired motion trajectory for the object, it seeks a sequence of dynamically feasible contact points and contact forces that lets the robot carry out complex manipulation tasks. The problem is challenging because the planner must handle discrete changes in contact mode and continuous dynamic constraints at the same time, which usually leads to intractable combinatorial optimization problems. To address this, the paper proposes a method based on Monte Carlo Tree Search (MCTS), combined with an efficient ADMM (Alternating Direction Method of Multipliers) trajectory optimization algorithm that evaluates the dynamic feasibility of candidate contact sequences. In addition, to accelerate MCTS, the paper proposes training a goal-conditioned policy-value network and a feasibility classifier to guide the search towards more promising nodes. Together, these components allow the proposed approach to perform well on complex, long-horizon tasks and to significantly improve the planning success rate.
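To make the idea of a learned policy guiding the tree search concrete, the following is a minimal sketch of a PUCT-style selection and backup step, as popularized by AlphaZero-like planners. It is an illustration under stated assumptions, not the paper's implementation: this summary does not specify the exact selection rule, and the names `Node`, `puct_score`, `select_child`, `backup`, and the constant `c_puct` are all hypothetical. Here `prior` stands in for the probability a policy network might assign to a candidate contact, and the backed-up value stands in for a score derived from a value network or a trajectory-optimization feasibility check.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the search tree (hypothetical structure for illustration)."""
    prior: float                      # assumed: policy-network probability of reaching this node
    visits: int = 0                   # number of times the node was traversed
    value_sum: float = 0.0            # sum of values backed up through the node
    children: dict = field(default_factory=dict)  # action label -> Node

    def mean_value(self) -> float:
        # Average backed-up value; 0.0 for an unvisited node.
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.0) -> float:
    """Mean value plus a prior-weighted exploration bonus that shrinks with visits."""
    exploration = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.mean_value() + exploration


def select_child(parent: Node, c_puct: float = 1.0):
    """Return the (action, child) pair with the highest PUCT score."""
    return max(parent.children.items(),
               key=lambda item: puct_score(parent, item[1], c_puct))


def backup(path, value: float) -> None:
    """Propagate an evaluation result to every node on the selected path."""
    for node in path:
        node.visits += 1
        node.value_sum += value
```

In this sketch, an action with a high prior is tried first, but repeated low-value evaluations reduce its score so that the search moves on to alternatives. In a planner like the one described above, the value passed to `backup` would presumably come from the learned value estimate or from whether trajectory optimization found the contact sequence feasible.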