Abstract: Optimization problems over dynamic networks have been extensively studied and widely used in the past decades to formulate numerous real-world problems. However, (1) traditional optimization-based approaches do not scale to large networks, and (2) the design of good heuristics or approximation algorithms often requires significant manual trial-and-error. In this work, we argue that data-driven strategies can automate this process and learn efficient algorithms without compromising optimality. To do so, we present network control problems through the lens of reinforcement learning and propose a graph network-based framework to handle a broad class of problems. Instead of naively computing actions over high-dimensional graph elements, e.g., edges, we propose a bi-level formulation where we (1) specify a desired next state via RL, and (2) solve a convex program to best achieve it, leading to drastically improved scalability and performance. We further highlight a collection of desirable features to system designers, investigate design decisions, and present experiments on real-world control problems showing the utility, scalability, and flexibility of our framework.

What problem does this paper attempt to address?

This paper addresses several key challenges in dynamic network control:

1. **Optimization over large-scale networks**: Traditional optimization-based methods do not scale to large networks.
2. **Design of efficient algorithms**: Designing effective heuristics or approximation algorithms usually requires extensive manual trial-and-error.
3. **Nonlinearity, stochasticity, and multi-stage decision-making**: Traditional methods struggle with nonlinear dynamics, stochastic systems, and the curse of dimensionality in time-expanded networks.

To address these challenges, the authors propose a bi-level optimization framework based on graph networks and reinforcement learning. The framework tackles the problems above in the following ways:

- **Bi-level optimization**: The first level specifies the desired next state through reinforcement learning, and the second level solves a convex optimization problem to best achieve that state. This improves scalability while maintaining optimization performance.
- **Data-driven strategy**: A data-driven method automates the algorithm design process, learning efficient algorithms without sacrificing optimality.
- **Flexibility and adaptability**: The framework handles various types of dynamic network control problems, including practical applications such as supply chain management and dynamic vehicle routing.

### Paper contributions

The main contributions of the paper are:

1. **Proposing a bi-level reinforcement learning method based on graph networks**, which combines the advantages of direct optimization and reinforcement learning.
2. **Exploring the architectural components and design decisions within the framework**, such as the choice of graph aggregation functions, action parameterization, and exploration strategies, and their impact on system performance.
3. **Demonstrating the advantages of this method in terms of performance, scalability, and robustness**. On both synthetic test problems and practical problems (such as supply chain inventory management and dynamic vehicle routing), the method performs better than classical optimization methods, domain-specific heuristics, and pure end-to-end reinforcement learning methods.

### Specific problem solving

- **Large-scale network optimization**: With the bi-level formulation, the method operates efficiently on large-scale networks, avoiding the computational bottlenecks of traditional methods.
- **Efficient algorithm design**: The data-driven strategy reduces the need for manual trial-and-error and automatically generates efficient control algorithms.
- **Nonlinearity, stochasticity, and multi-stage decision-making**: The framework handles nonlinear dynamics, stochasticity, and multi-stage decision-making by combining reinforcement learning with optimization.

In summary, this paper addresses several key challenges in dynamic network control by proposing a bi-level optimization framework, and it demonstrates the framework's performance and broad applicability on practical problems.
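The bi-level idea summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`desired_state`, `next_state`, `solve_lower_level`), the softmax over fixed scores standing in for a trained RL policy, and the particular lower-level problem (a quadratic objective with non-negative edge flows, solved by projected gradient descent) are all assumptions chosen to keep the example self-contained. The paper only states that the lower level is a convex program.

```python
import math


def desired_state(scores, total):
    """Upper level (stand-in for an RL policy): map per-node scores to a
    desired distribution of `total` units over the nodes via a softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [total * e / z for e in exps]


def next_state(state, edges, flows):
    """Move flows[k] units along directed edge edges[k] = (i, j)."""
    nxt = list(state)
    for (i, j), f in zip(edges, flows):
        nxt[i] -= f
        nxt[j] += f
    return nxt


def solve_lower_level(state, target, edges, cost_weight=0.01,
                      step=0.05, iters=2000):
    """Lower level (hypothetical convex program):

        minimize   sum_v (next_v(f) - target_v)^2 + cost_weight * sum_e f_e^2
        subject to f_e >= 0

    solved by projected gradient descent. The step size is assumed small
    enough for the example graph below; it is not tuned for general graphs.
    """
    flows = [0.0] * len(edges)
    for _ in range(iters):
        nxt = next_state(state, edges, flows)
        resid = [n - t for n, t in zip(nxt, target)]
        for k, (i, j) in enumerate(edges):
            # Gradient of the objective with respect to the flow on edge k.
            grad = 2.0 * (resid[j] - resid[i]) + 2.0 * cost_weight * flows[k]
            # Gradient step followed by projection onto f_e >= 0.
            flows[k] = max(0.0, flows[k] - step * grad)
    return flows


# A 3-node line graph with edges in both directions; all units start at node 0.
state = [10.0, 0.0, 0.0]
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]

# Equal scores ask for an even spread of the 10 units across the nodes.
target = desired_state([0.0, 0.0, 0.0], total=sum(state))
flows = solve_lower_level(state, target, edges)
print("desired:", target)
print("flows:", flows)
print("achieved:", next_state(state, edges, flows))
```

The point of the split is that the learned component only outputs one number per node, while the per-edge decisions are delegated to the optimization step. The sketch ignores details a real system would need, such as capacity limits and the requirement that a node cannot send more than it holds.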

Graph Reinforcement Learning for Network Control via Bi-Level Optimization

Optimal Control for Constrained Discrete-Time Nonlinear Systems Based on Safe Reinforcement Learning

Controlling Directed Networks with Evolving Topologies

Scalable Reinforcement Learning for Linear-Quadratic Control of Networks

Network Topology Optimization via Deep Reinforcement Learning

Real-time Control of Electric Autonomous Mobility-on-Demand Systems via Graph Reinforcement Learning

Hierarchical Reinforcement Learning for Power Network Topology Control

RLOC: Neurobiologically Inspired Hierarchical Reinforcement Learning Algorithm for Continuous Control of Nonlinear Dynamical Systems

Optimizing the Controllability of Arbitrary Networks with Genetic Algorithm

Network control by a constrained external agent as a continuous optimization problem

Graph Reinforcement Learning for Combinatorial Optimization: A Survey and Unifying Perspective

Efficient and scalable reinforcement learning for large-scale network control

Network planning with deep reinforcement learning

Deep Reinforcement Learning meets Graph Neural Networks: exploring a routing optimization use case

Decentralized Routing and Radio Resource Allocation in Wireless Ad Hoc Networks via Graph Reinforcement Learning

Learning to Solve Combinatorial Optimization Problems on Real-World Graphs in Linear Time

Goal-directed graph construction using reinforcement learning

An Optimal Control-Based Distributed Reinforcement Learning Framework for a Class of Non-Convex Objective Functionals of the Multi-Agent Network

Coordinated Reinforcement Learning for Optimizing Mobile Networks

Deep Reinforcement Learning Based Optimal Infinite-Horizon Control of Probabilistic Boolean Control Networks

Multi-Agent Reinforcement Learning for Power Control in Wireless Networks via Adaptive Graphs