Mitigating Dimensionality in 2D Rectangle Packing Problem under Reinforcement Learning Schema

Waldemar Kołodziejczyk, Mariusz Kaleta
DOI: https://doi.org/10.17388/WUT.2024.0002.MiNI
2024-09-15
Abstract: This paper explores the application of Reinforcement Learning (RL) to the two-dimensional rectangular packing problem. We propose a reduced representation of the state and action spaces that allows for high granularity. Leveraging the UNet architecture and Proximal Policy Optimization (PPO), we obtained a model that is comparable to the MaxRect heuristic. Moreover, our approach has great potential to be generalized to nonrectangular packing problems and complex constraints.
Machine Learning, Optimization and Control
What problem does this paper attempt to address?
The paper addresses the 2D Rectangle Strip Packing Problem, a classic NP-hard problem with many practical applications. The goal is to place a set of rectangles as densely as possible within a strip of given width. To simplify the problem, the strip is assumed to be sufficiently tall, and a fixed-size container is used throughout the paper. The focus is on the online version of the problem, in which rectangles are processed in descending order of area.

The main contribution is a Reinforcement Learning (RL) method that uses a reduced representation of the state and action spaces to achieve high granularity. By combining the UNet architecture with Proximal Policy Optimization (PPO), the researchers obtained a model whose performance is comparable to that of the MaxRects heuristic. The method also has significant potential to be extended to non-rectangular packing problems and to applications with complex constraints.

In the experimental section, the researchers evaluated the model on 500 episodes, each containing 15 elements arranged in descending order of area. The results indicate that in some cases the proposed RL method even outperforms the MaxRects algorithm, especially on random element sets. Overall, despite the challenges of a full 2D representation, the reduced 1D method shows potential advantages for the 2D Rectangle Strip Packing Problem and provides a foundation for further work on non-rectangular packing problems.
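To make the episode setup and the idea of a reduced 1D representation concrete, here is a minimal Python sketch. It is illustrative only: the summary does not specify the paper's exact state encoding, so the column-wise height profile, the side-length range, and all function names (`make_episode`, `place`, `greedy_position`, `pack`, `density`) are assumptions, and the greedy rule merely stands in for the learned PPO policy.

```python
import random


def make_episode(n_items=15, max_side=10, seed=None):
    """Sample random (width, height) rectangles, sorted by descending area.

    The side-length range is an assumption; the source only states that
    each episode has 15 elements ordered by descending area.
    """
    rng = random.Random(seed)
    rects = [(rng.randint(1, max_side), rng.randint(1, max_side))
             for _ in range(n_items)]
    return sorted(rects, key=lambda r: r[0] * r[1], reverse=True)


def place(heights, x, w, h):
    """Drop a w-by-h rectangle at column x onto a 1D height profile.

    `heights[i]` is the current top of column i (hypothetical reduced
    state). Returns a new profile and leaves the input unchanged.
    """
    if x < 0 or x + w > len(heights):
        raise ValueError("rectangle does not fit within the strip width")
    base = max(heights[x:x + w])  # the rectangle rests on the tallest column
    new = list(heights)
    for i in range(x, x + w):
        new[i] = base + h
    return new


def greedy_position(heights, w):
    """Stand-in policy: leftmost x with the lowest resting height.

    An RL agent would instead pick x (a 1D action) from the profile.
    """
    candidates = range(len(heights) - w + 1)
    return min(candidates, key=lambda x: (max(heights[x:x + w]), x))


def pack(rects, strip_width):
    """Pack rectangles online, one at a time, in the given order."""
    heights = [0] * strip_width
    for w, h in rects:
        heights = place(heights, greedy_position(heights, w), w, h)
    return heights


def density(rects, heights):
    """Fraction of the used strip area (width x max height) that is filled."""
    used = max(heights) * len(heights)
    return sum(w * h for w, h in rects) / used if used else 0.0


if __name__ == "__main__":
    episode = make_episode(seed=0)
    profile = pack(episode, strip_width=20)
    print(f"strip height used: {max(profile)}, "
          f"density: {density(episode, profile):.2f}")
```

In this sketch both the state (the height profile) and the action (a single x coordinate) are one-dimensional, which is one plausible reading of the "reduced 1D" representation; the paper's actual encoding may differ.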