Abstract: Although neural networks have achieved great success in various machine learning tasks, their black-box nature makes it hard to know what they learn from data. This lack of explainability limits the use of neural networks in domains that demand transparency and accountability, e.g., healthcare and finance. Moreover, explainability helps guide a neural network to learn causal patterns that extrapolate to out-of-distribution (OOD) data, which is critical in real-world applications and has become an active research topic. To improve the explainability of neural networks, we propose a novel method, Explainable Neural Rule Learning (ENRL), which aims to integrate the expressiveness of neural networks with the explainability of rule-based systems. Specifically, we first design several operator modules and guide them to behave as certain relational operators via self-supervised learning. With input feature fields and learnable context values serving as arguments, these operator modules are used as predicates to constitute atomic propositions. We then employ neural logical operations to combine the atomic propositions into a collection of rules. Finally, we design a voting mechanism for these rules so that they collaboratively make up our predictive model. Rule learning is thus transformed into neural architecture search, that is, choosing the appropriate arrangement of feature fields and operator modules. After searching for a specific architecture and learning the involved modules, the resulting neural network explicitly expresses a set of rules and thus possesses explainability. We can therefore make a prediction for each input instance according to the rules it satisfies, which at the same time explains how the neural network arrives at that decision. We conduct a series of experiments on both synthetic and real-world datasets to evaluate ENRL. Compared with conventional neural networks, ENRL achieves competitive in-distribution performance while providing the extra benefit of explainability. Meanwhile, ENRL significantly alleviates the performance drop on OOD test data, which indicates the effectiveness of rule learning. Code is available at https://github.com/Shuriken13/ENRL.
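The pipeline described in the abstract (predicates over feature fields and context values, logical combination into rules, then voting) can be illustrated with a small sketch. This is not the ENRL implementation. It replaces the learned operator modules with fixed sigmoid-based soft comparisons, uses a product as the soft conjunction, and uses a weighted sum as the vote. The field names (`age`, `bmi`), the context values, the `sharpness` parameter, and the vote weights are all hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Soft relational predicates (assumed stand-ins for learned operator modules).
# Each takes a feature value x and a context value c and returns a truth
# degree strictly between 0 and 1.
def greater_than(x, c, sharpness=10.0):
    return sigmoid(sharpness * (x - c))


def less_than(x, c, sharpness=10.0):
    return sigmoid(sharpness * (c - x))


def soft_and(*truths):
    """Product t-norm: a simple differentiable conjunction (an assumption)."""
    result = 1.0
    for t in truths:
        result *= t
    return result


def rule_truth(instance, atoms):
    """A rule is a conjunction of atomic propositions (field, operator, context)."""
    return soft_and(*(op(instance[field], ctx) for field, op, ctx in atoms))


def predict(instance, rules):
    """Weighted vote over rules; returns the score and each rule's contribution."""
    contributions = [rule_truth(instance, atoms) * vote for atoms, vote in rules]
    return sum(contributions), contributions


# Hypothetical rule set: (list of atoms, vote weight).
RULES = [
    ([("age", greater_than, 60.0), ("bmi", greater_than, 30.0)], 2.0),
    ([("age", less_than, 30.0)], -1.5),
]

older_score, older_parts = predict({"age": 70.0, "bmi": 35.0}, RULES)
younger_score, younger_parts = predict({"age": 20.0, "bmi": 22.0}, RULES)
```

The per-rule contributions returned by `predict` are what make the output explainable in this sketch: for the first instance almost all of the score comes from the first rule, and for the second instance almost all of it comes from the second rule. In the method described above, which atoms appear in which rule would be found by architecture search rather than written by hand.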

Interpretable and Explainable Logical Policies via Neurally Guided Symbolic Abstraction

Three Pathways to Neurosymbolic Reinforcement Learning with Interpretable Model and Policy Networks

Learning Symbolic Rules for Interpretable Deep Reinforcement Learning

EXPIL: Explanatory Predicate Invention for Learning in Games

BlendRL: A Framework for Merging Symbolic and Neural Policy Learning

A Neuro-Symbolic Approach to Multi-Agent RL for Interpretability and Probabilistic Decision Making

End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations

Explaining the Behaviour of Reinforcement Learning Agents in a Multi-Agent Cooperative Environment Using Policy Graphs

Symbolic Visual Reinforcement Learning: A Scalable Framework with Object-Level Abstraction and Differentiable Expression Search

Why? Why not? When? Visual Explanations of Agent Behavior in Reinforcement Learning

Differentiable Logic Policy for Interpretable Deep Reinforcement Learning: A Study From an Optimization Perspective

Distilling Reinforcement Learning Policies for Interpretable Robot Locomotion: Gradient Boosting Machines and Symbolic Regression

Interpretable Local Tree Surrogate Policies

REVEAL-IT: REinforcement learning with Visibility of Evolving Agent poLicy for InTerpretability

Efficient Symbolic Policy Learning with Differentiable Symbolic Expression

Learning Two-Step Hybrid Policy for Graph-Based Interpretable Reinforcement Learning

Counterfactual Explanation Policies in RL

Explainable Neural Rule Learning

Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning

Off-Policy Differentiable Logic Reinforcement Learning

Policy Learning with a Language Bottleneck