TAP-Net: Transport-and-Pack using Reinforcement Learning

Ruizhen Hu, Juzhan Xu, Bin Chen, Minglun Gong, Hao Zhang, Hui Huang
DOI: https://doi.org/10.1145/3414685.3417796
2020-09-03
Abstract: We introduce the transport-and-pack (TAP) problem, a frequently encountered instance of real-world packing, and develop a neural optimization solution based on reinforcement learning. Given an initial spatial configuration of boxes, we seek an efficient method to iteratively transport and pack the boxes compactly into a target container. Due to obstruction and accessibility constraints, our problem has to add a new search dimension, i.e., finding an optimal transport sequence, to the already immense search space for packing alone. Using a learning-based approach, a trained network can learn and encode solution patterns to guide the solution of new problem instances instead of executing an expensive online search. In our work, we represent the transport constraints using a precedence graph and train a neural network, coined TAP-Net, using reinforcement learning to reward efficient and stable packing. The network is built on an encoder-decoder architecture, where the encoder employs convolution layers to encode the box geometry and precedence graph and the decoder is a recurrent neural network (RNN) which inputs the current encoder output, as well as the current box packing state of the target container, and outputs the next box to pack, as well as its orientation. We train our network on randomly generated initial box configurations, without supervision, via policy gradients to learn optimal TAP policies to maximize packing efficiency and stability. We demonstrate the performance of TAP-Net on a variety of examples, evaluating the network through ablation studies and comparisons to baselines and alternative network designs. We also show that our network generalizes well to larger problem instances, when trained on small-sized inputs.
Graphics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses the **Transport-and-Pack (TAP) problem**, a packing problem that arises frequently in the real world. Given a set of boxes in an initial spatial configuration, the goal is to pack them efficiently and compactly into a target container. Because of obstruction and accessibility constraints in the physical process, solving TAP requires optimizing not only the packing itself but also the sequence in which the boxes are transported.

### Problem Background

1. **Packing Requirements in the Real World**: In practical applications such as robot-assisted packaging and transportation, objects are usually already in some physical arrangement, for example a stack that has built up as inventory accumulated over time. In this case, objects must be moved in an order that respects a partial order: an object can only be handled after all the objects above it have been removed.
2. **Additional Search Dimension**: Unlike traditional combinatorial optimization problems that focus only on the final packing state, the TAP problem introduces an additional search dimension: finding the optimal transport sequence. This makes the problem harder, because it is necessary to consider not only how to place objects compactly but also how to move them from their initial positions to the target container.

### Solutions Proposed in the Paper

To address the complexity of the TAP problem, the authors propose a neural combinatorial optimization method based on Reinforcement Learning (RL), called **TAP-Net**. The main features of this method include:

1. **Dynamic Input Representation**: TAP-Net uses a precedence graph to represent transport constraints and a height map to represent the current packing state of the target container. Both representations change dynamically and are updated as boxes are transported and packed.
2. **Encoder-Decoder Architecture**: The network adopts an encoder-decoder architecture. The encoder uses convolutional layers to encode the box geometry and the precedence graph, and the decoder is a Recurrent Neural Network (RNN) that predicts the next box to pack and its orientation from the current encoder output and the packing state of the target container.
3. **Reinforcement Learning Training**: TAP-Net is trained without supervision via policy gradients, aiming to learn TAP policies that maximize packing efficiency and stability. The reward function is defined in terms of the compactness and stability of the packing.
4. **Incremental Solution**: TAP-Net constructs the action sequence incrementally. At each step, it selects a box, determines its orientation, and packs it into the target container. This incremental construction helps the network handle larger problem instances.

### Summary

This paper tackles real-world transport-and-pack problems by introducing TAP-Net. By combining reinforcement learning with a neural network, TAP-Net learns to produce efficient transport and packing sequences that respect obstruction and accessibility constraints, yielding compact and stable packings.
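To make the precedence constraint and the height-map packing state described above more concrete, here is a minimal 2D sketch. It is not TAP-Net: the learned policy is replaced by a simple greedy rule (pick the accessible box and orientation whose top ends up lowest), and every name in it (`accessible`, `place`, `greedy_tap`, `compactness`, the `blockers` dictionary) is a hypothetical helper introduced for illustration. The compactness measure is likewise an assumed stand-in, not the paper's reward.

```python
from typing import Dict, List, Set, Tuple

Box = Tuple[int, int]  # (width, height) in grid cells; an assumed 2D simplification


def accessible(boxes: Dict[str, Box], blockers: Dict[str, Set[str]],
               removed: Set[str]) -> List[str]:
    """Boxes not yet transported whose blocking boxes have all been removed."""
    return sorted(name for name in boxes
                  if name not in removed and blockers.get(name, set()) <= removed)


def place(height_map: List[int], w: int, h: int) -> Tuple[int, int]:
    """Return (x, top) for the column offset giving the lowest top; ties go left."""
    best = None
    for x in range(len(height_map) - w + 1):
        top = max(height_map[x:x + w]) + h  # box rests on the tallest column under it
        if best is None or top < best[1]:
            best = (x, top)
    if best is None:
        raise ValueError("box is wider than the container")
    return best


def greedy_tap(boxes: Dict[str, Box], blockers: Dict[str, Set[str]],
               container_width: int) -> Tuple[List[Tuple[str, bool, int]], List[int]]:
    """Build a (box, rotated, x) sequence step by step with a greedy rule.

    Assumption: the greedy choice stands in for the learned decoder output.
    """
    height_map = [0] * container_width
    removed: Set[str] = set()
    sequence: List[Tuple[str, bool, int]] = []
    while len(removed) < len(boxes):
        candidates = []
        for name in accessible(boxes, blockers, removed):
            w, h = boxes[name]
            for rotated, (bw, bh) in ((False, (w, h)), (True, (h, w))):
                if bw <= container_width:
                    x, top = place(height_map, bw, bh)
                    candidates.append((top, name, rotated, x, bw))
        if not candidates:
            raise ValueError("no accessible box fits (cyclic precedence or box too wide)")
        top, name, rotated, x, bw = min(candidates)
        for i in range(x, x + bw):
            height_map[i] = top  # update the dynamic packing state
        removed.add(name)        # removing a box may unblock the boxes beneath it
        sequence.append((name, rotated, x))
    return sequence, height_map


def compactness(boxes: Dict[str, Box], height_map: List[int]) -> float:
    """Assumed measure: total box area over the bounding area of the packing."""
    peak = max(height_map)
    if peak == 0:
        return 0.0
    area = sum(w * h for w, h in boxes.values())
    return area / (len(height_map) * peak)


if __name__ == "__main__":
    demo_boxes = {"A": (2, 1), "B": (1, 2), "C": (2, 2)}
    demo_blockers = {"A": {"B"}}  # B sits on top of A, so B must be moved first
    seq, hm = greedy_tap(demo_boxes, demo_blockers, container_width=3)
    print(seq, hm, compactness(demo_boxes, hm))
```

The greedy rule only looks one step ahead, so it can produce loose packings. That limitation is the motivation for learning a policy that is rewarded on the final compactness and stability rather than on each step in isolation.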