Deep Reinforcement Learning in POMDPs for 3-D Palletization Problem

Ai Bo, Junguo Lu, Chunyu Zhao
DOI: https://doi.org/10.1109/cac57257.2022.10054950
2022-01-01
Abstract: The online 3D palletization problem is a generic variant in the family of bin packing problems (BPP). Conventional deep reinforcement learning (DRL) methods, however, perform well only on combinatorial optimization problems modeled as Markov decision processes (MDPs). Because online BPP reveals only fragments of the incoming item sequence, it is hard to describe the online 3D palletization problem as an MDP. We therefore formulate the online 3D palletization problem as a partially observable Markov decision process (POMDP) and propose a novel DRL method that estimates the state from observation trajectories. We also devise a DRL framework and train agents on environments with different box types. The results show that our method is effective across a range of experimental settings and achieves higher space utilization than conventional heuristic algorithms.
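To make the partial observability concrete, the following is a minimal sketch and not the authors' environment or method. It assumes a toy pallet represented as a heightmap, boxes given as (length, width, height) tuples with no rotation, and no stability check. The names `OnlinePalletEnv` and `ObservationHistory` are hypothetical. The agent observes only the heightmap and the current box, while the rest of the sequence stays hidden; a fixed-length buffer of recent observations stands in for the observation trajectory that a state estimator would consume.

```python
from collections import deque

import numpy as np


class OnlinePalletEnv:
    """Toy online 3-D palletization (hypothetical): only the current box is visible."""

    def __init__(self, length, width, height, boxes):
        self.L, self.W, self.H = length, width, height
        # Hidden part of the state: the full upcoming box sequence.
        self.boxes = list(boxes)
        self.reset()

    def reset(self):
        self.heightmap = np.zeros((self.L, self.W), dtype=int)
        self.index = 0
        self.packed_volume = 0
        return self.observe()

    def observe(self):
        # Observation: pallet heightmap plus the current box, never the future ones.
        box = self.boxes[self.index] if self.index < len(self.boxes) else None
        return self.heightmap.copy(), box

    def step(self, x, y):
        """Place the current box with its corner at (x, y); return (obs, reward, done)."""
        l, w, h = self.boxes[self.index]
        if x < 0 or y < 0 or x + l > self.L or y + w > self.W:
            return self.observe(), 0.0, True  # out of bounds ends the episode
        # The box rests on the highest cell under its footprint (assumption: no stability test).
        top = int(self.heightmap[x:x + l, y:y + w].max()) + h
        if top > self.H:
            return self.observe(), 0.0, True  # exceeding the height limit ends the episode
        self.heightmap[x:x + l, y:y + w] = top
        self.packed_volume += l * w * h
        self.index += 1
        done = self.index >= len(self.boxes)
        return self.observe(), float(l * w * h), done

    def utilization(self):
        # Space utilization: packed volume divided by pallet volume.
        return self.packed_volume / (self.L * self.W * self.H)


class ObservationHistory:
    """Keeps the last k observations as a stand-in for an observation trajectory."""

    def __init__(self, k):
        self.buffer = deque(maxlen=k)

    def push(self, obs):
        self.buffer.append(obs)

    def trajectory(self):
        return list(self.buffer)


env = OnlinePalletEnv(2, 2, 2, boxes=[(2, 2, 1), (1, 1, 1), (1, 1, 1)])
history = ObservationHistory(k=2)
history.push(env.reset())
for action in [(0, 0), (0, 0), (1, 1)]:
    obs, reward, done = env.step(*action)
    history.push(obs)
print(env.utilization())  # 0.75
```

In a learned agent, the list returned by `trajectory()` would typically be fed to a recurrent or otherwise history-aware network; which estimator the paper actually uses is not stated in the abstract.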