Knowledge Graph Reasoning with Self-supervised Reinforcement Learning

Ying Ma, Owen Burns, Mingqiu Wang, Gang Li, Nan Du, Laurent El Shafey, Liqiang Wang, Izhak Shafran, Hagen Soltau
2024-05-22
Abstract: Reinforcement learning (RL) is an effective method of finding reasoning pathways in incomplete knowledge graphs (KGs). To overcome the challenges of a large action space, a self-supervised pre-training method is proposed to warm up the policy network before the RL training stage. To alleviate the distributional mismatch issue in general self-supervised RL (SSRL), in our supervised learning (SL) stage, the agent selects actions based on the policy network and learns from generated labels; this self-generation of labels is the intuition behind the name self-supervised. With this training framework, the information density of our SL objective is increased and the agent is prevented from getting stuck with the early rewarded paths. Our SSRL method improves the performance of RL by pairing it with the wide coverage achieved by SL during pre-training, since the breadth of the SL objective makes it infeasible to train an agent with that alone. We show that our SSRL model meets or exceeds current state-of-the-art results on all Hits@k and mean reciprocal rank (MRR) metrics on four large benchmark KG datasets. This SSRL method can be used as a plug-in for any RL architecture for a KG reasoning (KGR) task. We adopt two RL architectures, i.e., MINERVA and MultiHopKG, as our baseline RL models and experimentally show that our SSRL model consistently outperforms both baselines on all four of these KG reasoning tasks. Full code for the paper available at
Computation and Language, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
This paper addresses reasoning over incomplete knowledge graphs (KGs), in particular inferring missing information in large-scale KGs. Specifically, it focuses on finding a plausible reasoning path through an incomplete graph to answer a KG query: given a starting entity and a relation, predict the target entity. Because real-world knowledge graphs rarely contain all relevant facts, this problem has substantial research value. To address it, the authors propose self-supervised reinforcement learning (SSRL), which combines a supervised learning (SL) pre-training stage with a reinforcement learning (RL) stage. In the SL stage, the agent selects actions using the policy network and learns from labels it generates itself. According to the abstract, this increases the information density of the SL objective and keeps the agent from getting stuck on early rewarded paths, while the RL stage builds on the wide coverage achieved during pre-training. The method targets query tasks with large action spaces.
The main contributions of the paper are:
1. A new SSRL framework for KG query answering, which can be used as a plug-in for any RL architecture for KG reasoning.
2. An analysis of the advantages and disadvantages of SL and RL in terms of coverage, learning speed, and feasibility, which motivates a different exploration strategy and training loss function that address the distributional mismatch problem in the general SSRL framework.
3. Experimental results showing that the SSRL model meets or exceeds current state-of-the-art results on all Hits@k and mean reciprocal rank (MRR) metrics on four large benchmark KG datasets, and consistently outperforms the MINERVA and MultiHopKG baselines.
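The query task described above (given a starting entity and a relation, predict the target entity) can be illustrated with a minimal sketch. The toy triples, the entity and relation names, and the helpers `build_graph`, `find_paths`, and `first_step_labels` are all hypothetical. Treating "actions that lie on some path to the target" as training labels is only one plausible reading of "generated labels", not the paper's confirmed procedure.

```python
from collections import defaultdict

# Toy knowledge graph as (head, relation, tail) triples; all names are made up.
TRIPLES = [
    ("alice", "works_at", "acme"),
    ("acme", "located_in", "paris"),
    ("alice", "friend_of", "bob"),
    ("bob", "lives_in", "berlin"),
    ("paris", "capital_of", "france"),
]

def build_graph(triples):
    """Map each entity to its outgoing (relation, tail) actions."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph

def find_paths(graph, start, target, max_hops):
    """Enumerate every path from start to target that uses at most max_hops edges."""
    paths = []
    stack = [(start, [])]
    while stack:
        entity, path = stack.pop()
        if entity == target and path:
            paths.append(path)
        if len(path) < max_hops:  # the hop budget also bounds any cycles
            for relation, tail in graph.get(entity, []):
                stack.append((tail, path + [(relation, tail)]))
    return paths

def first_step_labels(graph, start, target, max_hops):
    """Assumed labeling rule: first actions that lie on some path reaching the target."""
    return {path[0] for path in find_paths(graph, start, target, max_hops)}

graph = build_graph(TRIPLES)
# Query (alice, lives_in, ?): the direct fact (alice, lives_in, paris) is missing,
# so the answer has to be reached through a multi-hop path.
paths = find_paths(graph, "alice", "paris", max_hops=3)
labels = first_step_labels(graph, "alice", "paris", max_hops=3)
```

Exhaustive enumeration is only workable on a toy graph. On large KGs the number of available actions per entity grows quickly, which is the large-action-space difficulty the summary refers to.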
In conclusion, the SSRL method targets the exploration problem posed by large action spaces in knowledge graph reasoning and, according to the reported experiments, improves performance on query-answering tasks.
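The Hits@k and MRR metrics named in the abstract can be computed as in the sketch below. It uses the standard textbook definitions over a ranked candidate list. The function names and example rankings are hypothetical, and the paper's exact evaluation protocol (for instance, any filtering of other known correct answers) is not described here and is not reproduced.

```python
def reciprocal_rank(ranked_entities, correct_entity):
    """Return 1/rank of the correct entity (ranks start at 1), or 0.0 if it is absent."""
    for rank, entity in enumerate(ranked_entities, start=1):
        if entity == correct_entity:
            return 1.0 / rank
    return 0.0

def hits_at_k(ranked_entities, correct_entity, k):
    """Return 1.0 if the correct entity appears among the top k candidates, else 0.0."""
    return 1.0 if correct_entity in ranked_entities[:k] else 0.0

def evaluate(predictions, k_values=(1, 3, 10)):
    """Average MRR and Hits@k over a list of (ranked_entities, correct_entity) pairs."""
    n = len(predictions)
    metrics = {"MRR": sum(reciprocal_rank(r, c) for r, c in predictions) / n}
    for k in k_values:
        metrics[f"Hits@{k}"] = sum(hits_at_k(r, c, k) for r, c in predictions) / n
    return metrics

# Made-up rankings for four queries whose correct answer is "paris".
predictions = [
    (["paris", "berlin", "rome"], "paris"),            # correct answer at rank 1
    (["berlin", "paris", "rome"], "paris"),            # rank 2
    (["rome", "berlin", "madrid", "paris"], "paris"),  # rank 4
    (["rome", "berlin"], "paris"),                     # correct answer not retrieved
]
results = evaluate(predictions)
```

Both metrics lie in [0, 1] and higher is better. MRR rewards placing the correct entity near the top of the ranking, while Hits@k only checks whether it appears within the first k candidates.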