Abstract: This paper presents Grammar Reinforcement Learning (GRL), a reinforcement learning algorithm that uses Monte Carlo Tree Search (MCTS) and a transformer architecture that models a Pushdown Automaton (PDA) within a context-free grammar (CFG) framework. Taking as a use case the problem of efficiently counting paths and cycles in graphs, a key challenge in network analysis, computer science, biology, and social sciences, GRL discovers new matrix-based formulas for path/cycle counting that improve computational efficiency by factors of two to six with respect to state-of-the-art approaches. Our contributions include: (i) a framework for generating gramformers that operate within a CFG, (ii) the development of GRL for optimizing formulas within grammatical structures, and (iii) the discovery of novel formulas for graph substructure counting, leading to significant computational improvements.
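
To make the idea of generating formulas inside a CFG with a pushdown-style stack more concrete, here is a minimal sketch. The grammar, the symbol names, and the `derive` and `rule_picker` helpers are all hypothetical and chosen only for illustration; they are not the grammar or the gramformer from the paper. In GRL the choice of production would presumably come from the learned policy guided by MCTS, whereas here it is any caller-supplied function.

```python
import random

# Hypothetical toy grammar (NOT the paper's): "M" is the only nonterminal.
GRAMMAR = {
    "M": [
        ["(", "M", "*", "M", ")"],  # Hadamard (element-wise) product
        ["(", "M", "@", "M", ")"],  # ordinary matrix product
        ["A"],                      # adjacency matrix
        ["J"],                      # terminal symbol J
        ["I"],                      # terminal symbol I
    ]
}


def derive(rule_picker, start="M", max_steps=50):
    """Leftmost derivation driven by an explicit stack, as a PDA would do.

    rule_picker(symbol, rules) returns one production from `rules`.
    After max_steps expansions only terminal productions are offered,
    so the derivation is guaranteed to finish.
    """
    stack = [start]
    output = []
    steps = 0
    while stack:
        symbol = stack.pop()
        if symbol in GRAMMAR:
            rules = GRAMMAR[symbol]
            if steps >= max_steps:
                rules = [r for r in rules if all(s not in GRAMMAR for s in r)]
            rule = rule_picker(symbol, rules)
            steps += 1
            # Push in reverse so the leftmost symbol is popped first.
            stack.extend(reversed(rule))
        else:
            output.append(symbol)
    return " ".join(output)


if __name__ == "__main__":
    rng = random.Random(0)
    # A uniformly random picker stands in for a learned policy.
    print(derive(lambda symbol, rules: rng.choice(rules)))
```

Every string produced this way is a syntactically valid formula by construction, which is the property that makes searching over derivations, rather than over raw token sequences, attractive.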

What problem does this paper attempt to address?

The problem that this paper attempts to solve is: **How to efficiently calculate the number of paths and cycles in a graph**. Specifically, this problem is of great significance in multiple fields such as network analysis, computer science, biology, and social sciences. Although the traditional matrix formula method can effectively calculate paths of length six and below and cycles of length seven and below, there is still room for improvement in its efficiency.

### Problem Background

1. **Importance of Paths and Cycles**
   - Paths and cycles are fundamental structures in graph theory and are widely used in fields such as network analysis, chemistry, computer science, biology, and social sciences.
   - Efficiently calculating paths and cycles of different lengths is crucial for understanding the connectivity and redundancy of a graph and is also the basis of many graph-processing algorithms, including some recent graph neural networks (GNNs).
2. **Limitations of Existing Methods**
   - Existing matrix formula methods (such as the formula proposed in [11]) show high efficiency in calculating paths of length six and below and cycles of length seven and below.
   - However, these methods are less efficient in calculating longer paths and cycles and have theoretical limitations (for example, 3-WL cannot calculate cycles of length over seven).

### Core Contributions of the Paper

To overcome the limitations of existing methods, this paper proposes **Grammar Reinforcement Learning (GRL)**, which is a reinforcement learning algorithm that combines Monte Carlo Tree Search (MCTS) and the Transformer architecture. By operating within the context-free grammar (CFG) framework, GRL can discover new matrix formulas to calculate the number of paths and cycles more efficiently.

### Main Results

1. **Proposing the GRL Algorithm**
   - GRL can generate efficient path and cycle counting formulas within the CFG framework.
   - GRL not only recovers existing formulas but also discovers new formulas, with a 2- to 6-fold improvement in computational efficiency.
2. **Introducing the Gramformer Model**
   - Gramformer is a model based on the Transformer architecture that can simulate a pushdown automaton (PDA) and learn policy and value functions within the CFG framework.
3. **Discovering New Path and Cycle Counting Formulas**
   - The new formulas significantly improve computational efficiency and reduce time complexity.

### Conclusion

This paper demonstrates the potential of deep learning algorithms in discovering efficient path and cycle counting formulas, especially in the application within the CFG framework. Future research can further explore more complex grammars to break through current theoretical limitations and apply GRL to actual datasets to improve its applicability and effectiveness in various tasks.

### Related Formulas

The path and cycle counting formulas mentioned in the paper are as follows:

- **3-Path Formula**
  \[ P_3 = J \odot A^3 - (I \odot A^2)A - A(I \odot A^2) + A \]
- **Improved Path Formulas**
  \[ P_2^* = J \odot A^2 \]
  \[ P_3^* = J \odot (A(J \odot A^2)) - A \odot (AJ) \]
  \[ P_4^* = J \odot (A(J \odot (A(J \odot A^2)))) - J \odot (A(A \odot (AJ))) - J \odot ((A \odot (AJ))A) - A \odot ((A \odot A^2)J) + 2A \odot A^2 \]

These formulas significantly improve computational efficiency by reducing the number of matrix multiplications.
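
The formulas above can be checked numerically on small graphs. The sketch below is not code from the paper: it assumes that \(A\) is the symmetric 0/1 adjacency matrix of a simple graph without self-loops, that \(I\) is the identity, that \(\odot\) is the element-wise product, and that \(J\) is the all-ones matrix with a zero diagonal (the summary does not define \(J\), so this reading is an assumption). The function names are illustrative. Each formula is compared with a brute-force enumeration of simple paths.

```python
import itertools
import numpy as np


def matrix_path_counts(A):
    """Evaluate the quoted formulas for a symmetric 0/1 adjacency matrix A."""
    n = A.shape[0]
    I = np.eye(n, dtype=np.int64)
    J = np.ones((n, n), dtype=np.int64) - I  # assumption: ones off the diagonal
    A2 = A @ A
    AJ = A @ J
    P2s = J * A2
    P3 = J * (A2 @ A) - (I * A2) @ A - A @ (I * A2) + A
    P3s = J * (A @ P2s) - A * AJ
    P4s = (J * (A @ (J * (A @ P2s)))
           - J * (A @ (A * AJ))
           - J * ((A * AJ) @ A)
           - A * ((A * A2) @ J)
           + 2 * A * A2)
    return {"P3": P3, "P2*": P2s, "P3*": P3s, "P4*": P4s}


def brute_force_paths(A, k):
    """Count simple paths with k edges between every ordered vertex pair."""
    n = A.shape[0]
    P = np.zeros((n, n), dtype=np.int64)
    for verts in itertools.permutations(range(n), k + 1):
        if all(A[u, v] for u, v in zip(verts, verts[1:])):
            P[verts[0], verts[-1]] += 1
    return P


def random_graph(n, seed):
    """Random simple undirected graph as an int64 adjacency matrix."""
    rng = np.random.default_rng(seed)
    upper = np.triu(rng.integers(0, 2, size=(n, n)), 1)
    return upper + upper.T


if __name__ == "__main__":
    A = random_graph(7, seed=0)
    counts = matrix_path_counts(A)
    for name, k in [("P2*", 2), ("P3", 3), ("P3*", 3), ("P4*", 4)]:
        print(name, np.array_equal(counts[name], brute_force_paths(A, k)))
```

Entry \((i, j)\) of each result is the number of simple paths with the given number of edges from vertex \(i\) to vertex \(j\). The brute-force check is only practical for small graphs, since it enumerates vertex tuples.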

Finding path and cycle counting formulae in graphs with Deep Reinforcement Learning

Learning to Cooperate: Application of Deep Reinforcement Learning for Online AGV Path Finding.

G-PCGRL: Procedural Graph Data Generation via Reinforcement Learning

Graph learning-based generation of abstractions for reinforcement learning

A Deep Reinforcement Learning Agent for Geometry Online Tutoring

GRL: Knowledge graph completion with GAN-based reinforcement learning

Graph Convolution-Based Deep Reinforcement Learning for Multi-Agent Decision-Making in Mixed Traffic Environments

RLgraph: Modular Computation Graphs for Deep Reinforcement Learning

PCGRL+: Scaling, Control and Generalization in Reinforcement Learning Level Generators

Graph Reinforcement Learning for Power Grids: A Comprehensive Survey

Reinforcement Learning Based Monte Carlo Tree Search for Temporal Path Discovery

Research on Knowledge Graph Completion Model Combining Temporal Convolutional Network and Monte Carlo Tree Search

RoboGrammar

Reinforced Molecular Optimization with Neighborhood-Controlled Grammars

Reinforcement Learning Discovers Efficient Decentralized Graph Path Search Strategies

Routing optimization with Monte Carlo Tree Search-based multi-agent reinforcement learning

REINAM: Reinforcement Learning for Input-Grammar Inference.

Model-Free Generative Replay for Lifelong Reinforcement Learning: Application to Starcraft-2

Autonomous Learning and Navigation of Mobile Robots Based on Deep Reinforcement Learning

Feudal Graph Reinforcement Learning

Constructing Ancestral Recombination Graphs through Reinforcement Learning