Softened Symbol Grounding for Neuro-symbolic Systems

Zenan Li, Yuan Yao, Taolue Chen, Jingwei Xu, Chun Cao, Xiaoxing Ma, Jian Lü
2024-03-01
Abstract: Neuro-symbolic learning generally consists of two separated worlds, i.e., neural network training and symbolic constraint solving, whose success hinges on symbol grounding, a fundamental problem in AI. This paper presents a novel, softened symbol grounding process, bridging the gap between the two worlds, and resulting in an effective and efficient neuro-symbolic learning framework. Technically, the framework features (1) modeling of symbol solution states as a Boltzmann distribution, which avoids expensive state searching and facilitates mutually beneficial interactions between network training and symbolic reasoning; (2) a new MCMC technique leveraging projection and SMT solvers, which efficiently samples from disconnected symbol solution spaces; (3) an annealing mechanism that can escape from sub-optimal symbol groundings. Experiments with three representative neuro-symbolic learning tasks demonstrate that, owing to its superior symbol grounding capability, our framework successfully solves problems well beyond the frontier of the existing proposals.
Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the **symbol grounding problem** in neuro-symbolic systems. Such systems usually consist of two separate worlds: neural network training and symbolic constraint solving. Their success depends on symbol grounding, a fundamental problem in artificial intelligence. The paper proposes a softened symbol grounding process that bridges the gap between these two worlds and yields an effective and efficient neuro-symbolic learning framework.

#### Main challenges

1. **Semantic gap**: Neural learning is stochastic and continuous, while symbolic reasoning is deterministic and discrete. This mismatch makes symbol grounding difficult.
2. **Sparsity problem**: In the symbolic space, feasible solutions are sparse and poorly connected to one another, which makes it difficult for traditional Markov Chain Monte Carlo (MCMC) sampling to explore the solution space efficiently.
3. **Initial model dependence**: Existing methods perform poorly without a good initial model.

#### Solutions

1. **Softening symbol distribution**: Instead of directly searching for a deterministic input-symbol mapping, the framework optimizes a Boltzmann distribution over symbol states, which gradually converges to a deterministic mapping.
2. **Projection technique**: Projection is used to accelerate the random walk and is combined with a Satisfiability Modulo Theories (SMT) solver to overcome the connectivity barriers in the solution space.
3. **Annealing mechanism**: An annealing strategy gradually reduces the temperature parameter \(\gamma\) to help the model escape from sub-optimal symbol groundings.
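The softening and annealing ideas in the Solutions list can be illustrated with a minimal sketch. It assumes a small, explicitly enumerated set of feasible symbol states and weights each state by the network probability raised to the power \(1/\gamma\), then normalizes. The probabilities, the helper name `softened_distribution`, and the halving schedule are hypothetical choices for illustration, not values or code from the paper.

```python
import math

def softened_distribution(probs, gamma):
    """Return Q(z) proportional to probs[z] ** (1 / gamma).

    probs: positive network probabilities of the feasible symbol states
           (hypothetical values; they need not sum to 1).
    gamma: temperature; smaller values give a sharper distribution.
    """
    # Work in log space and subtract the maximum for numerical stability.
    scaled = [math.log(p) / gamma for p in probs]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical network probabilities for three feasible symbol states.
p_feasible = [0.5, 0.3, 0.2]

# Illustrative exponential annealing: halve the temperature at each step.
gamma = 1.0
for step in range(4):
    q = softened_distribution(p_feasible, gamma)
    print(f"gamma={gamma:.3f}", [round(v, 3) for v in q])
    gamma *= 0.5
```

As the temperature decreases, the probability mass concentrates on the state the network currently prefers, which is how a soft distribution can end in a near-deterministic mapping.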
### Formulas

- **Boltzmann distribution**:

  \[ Q_\phi(z)=\frac{P_\theta(z|x)^{1 / \gamma}}{\sum_{z' \in S_y} P_\theta(z'|x)^{1 / \gamma}} \]

  where \(P_\theta(z|x)\) is the probability distribution over symbol states produced by the neural network, \(S_y\) is the set of feasible symbol states over which the distribution is normalized, and \(\gamma\) is the temperature parameter that controls the sharpness of the distribution.

- **Loss function**:

  \[ \ell(\theta, \phi)=\sum_{i = 1}^N\left[-\sum_{z_i \in S_{y_i}} Q_\phi(z_i)\log P_\theta(z_i|x_i)+\gamma\sum_{z_i \in S_{y_i}} Q_\phi(z_i)\log Q_\phi(z_i)\right] \]

- **Acceptance rate**:

  \[ \tau=\left(\frac{P_\theta(z'|x)}{P_\theta(z|x)}\right)^{1 / \gamma} \]

### Experimental results

The framework was evaluated on three representative tasks: Handwritten Formula Evaluation (HWF), Visual Sudoku Classification (Sudoku), and Single-Destination Shortest-Path Prediction in Weighted Graphs (SDSP). The results show that the proposed framework is significantly superior to existing methods in symbol grounding ability. In particular, on the Handwritten Formula Evaluation task, the two-stage algorithm using the exponential annealing strategy achieved the best performance in both symbol accuracy and computational accuracy.

### Summary

This paper addresses the symbol grounding problem in neuro-symbolic systems by introducing a softened symbol grounding process, which improves the generalization ability and efficiency of the model.
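To make the acceptance-rate formula from the Formulas section concrete, the following is a minimal Metropolis sketch over a tiny, explicitly enumerated feasible set. It is only an illustration under strong assumptions: the proposal here is a uniform random pick among the feasible states, standing in for the projection and SMT-solver-based proposals the paper describes, and the probabilities and the name `metropolis_sample` are hypothetical.

```python
import random

def metropolis_sample(probs, gamma, n_steps, rng):
    """Visit state indices with frequency proportional to probs[z] ** (1 / gamma).

    probs: positive network probabilities of the feasible states (hypothetical).
    Returns the empirical visit frequency of each state.
    """
    z = rng.randrange(len(probs))
    counts = [0] * len(probs)
    for _ in range(n_steps):
        # Symmetric uniform proposal (a stand-in, not the paper's proposal).
        z_new = rng.randrange(len(probs))
        # Acceptance ratio: tau = (P(z'|x) / P(z|x)) ** (1 / gamma).
        tau = (probs[z_new] / probs[z]) ** (1.0 / gamma)
        if rng.random() < min(1.0, tau):
            z = z_new
        counts[z] += 1
    return [c / n_steps for c in counts]

rng = random.Random(0)
p_feasible = [0.5, 0.3, 0.2]
freqs = metropolis_sample(p_feasible, gamma=1.0, n_steps=20000, rng=rng)
print([round(f, 2) for f in freqs])
```

With a symmetric proposal, accepting a move with probability \(\min(1, \tau)\) leaves the Boltzmann distribution \(Q_\phi\) invariant, so at \(\gamma = 1\) the visit frequencies should approach the normalized network probabilities.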