Graph Q-Learning for Combinatorial Optimization

Victoria M. Dax, Jiachen Li, Kevin Leahy, Mykel J. Kochenderfer
2024-01-11
Abstract: Graph-structured data is ubiquitous throughout natural and social sciences, and Graph Neural Networks (GNNs) have recently been shown to be effective at solving prediction and inference problems on graph data. In this paper, we propose and demonstrate that GNNs can be applied to solve Combinatorial Optimization (CO) problems. CO concerns optimizing a function over a discrete solution space that is often intractably large. To learn to solve CO problems, we formulate the optimization process as a sequential decision making problem, where the return is related to how close the candidate solution is to optimality. We use a GNN to learn a policy to iteratively build increasingly promising candidate solutions. We present preliminary evidence that GNNs trained through Q-Learning can solve CO problems with performance approaching state-of-the-art heuristic-based solvers, using only a fraction of the parameters and training time.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper examines how Graph Neural Networks (GNNs) can be used to solve Combinatorial Optimization (CO) problems. CO problems are hard to solve exactly because the discrete solution space is often intractably large, so they are usually tackled with heuristic methods. The researchers instead model the optimization process as a sequential decision-making problem and use a GNN to learn a policy that incrementally constructs better candidate solutions.

The specific contributions are:
1. The first proposal to represent CO problems as graphs and formalize their solution as a Markov Decision Process (MDP).
2. A demonstration that GNNs trained through Q-learning can solve CO problems, together with an argument that the approach can be generalized to meta-learning.
3. Preliminary experimental results in which the GNN approaches the performance of state-of-the-art heuristic-based solvers while using only a fraction of the parameters and training time.

The paper also reviews successful applications of GNNs to prediction and reasoning problems, as well as early attempts to use them in reinforcement learning. The method section describes how to transform a CO problem into an MDP and use a GNN to learn a policy that minimizes the makespan (the time at which the last operation finishes). The experiments focus on the Flexible Job Shop Scheduling Problem (FJSP), where the proposed GNN reinforcement learning method achieves performance close to that of optimal solvers with a shorter runtime.

In summary, the paper aims to demonstrate the potential of GNNs for solving complex combinatorial optimization problems and offers new perspectives and methods for future work.
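
To make the MDP formulation concrete, here is a minimal sketch of a scheduling problem cast as a sequential decision process and solved with Q-learning. It is not the authors' implementation: a tabular Q-function stands in for the GNN, and the toy instance, the state encoding, the reward (negative increase in makespan), and all function names and hyperparameters are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy FJSP instance: JOBS[j][k] maps machine -> processing time
# for the k-th operation of job j. Operations of a job must run in order.
JOBS = [
    [{0: 3, 1: 5}, {1: 2}],
    [{0: 2, 1: 4}, {0: 4, 1: 3}],
]
N_MACHINES = 2


def initial_state():
    # State: (next operation index per job,
    #         time each job becomes ready,
    #         time each machine becomes free)
    return (0,) * len(JOBS), (0,) * len(JOBS), (0,) * N_MACHINES


def actions(state):
    # An action (j, m) schedules the next operation of job j on machine m.
    next_op = state[0]
    return [(j, m) for j, k in enumerate(next_op) if k < len(JOBS[j])
            for m in JOBS[j][k]]


def step(state, action):
    next_op, job_ready, mach_free = state
    j, m = action
    # The operation starts once both the job and the machine are available.
    end = max(job_ready[j], mach_free[m]) + JOBS[j][next_op[j]][m]
    new_state = (
        next_op[:j] + (next_op[j] + 1,) + next_op[j + 1:],
        job_ready[:j] + (end,) + job_ready[j + 1:],
        mach_free[:m] + (end,) + mach_free[m + 1:],
    )
    # Reward is the negative growth in makespan, so with gamma = 1 the
    # episode return equals minus the final makespan.
    reward = -(max(new_state[2]) - max(mach_free))
    return new_state, reward


def train(episodes=3000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # tabular stand-in for the GNN Q-function
    for _ in range(episodes):
        state = initial_state()
        while actions(state):
            acts = actions(state)
            if rng.random() < epsilon:
                a = rng.choice(acts)                      # explore
            else:
                a = max(acts, key=lambda x: q[state, x])  # exploit
            nxt, r = step(state, a)
            best_next = max((q[nxt, b] for b in actions(nxt)), default=0.0)
            # Standard Q-learning update toward the one-step target.
            q[state, a] += alpha * (r + gamma * best_next - q[state, a])
            state = nxt
    return q


def greedy_makespan(q):
    # Build a full schedule by always taking the highest-valued action.
    state = initial_state()
    while actions(state):
        a = max(actions(state), key=lambda x: q[state, x])
        state, _ = step(state, a)
    return max(state[2])
```

Calling `greedy_makespan(train())` builds a complete schedule from the learned values and returns its makespan. A table only works because this instance has a handful of states; the appeal of a GNN in this setting is that it can share parameters across graph-structured states that a table would have to enumerate.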