PruneSymNet: A Symbolic Neural Network and Pruning Algorithm for Symbolic Regression

Min Wu, Weijun Li, Lina Yu, Wenqiang Li, Jingyi Liu, Yanjie Li, Meilan Hao
2024-01-25
Abstract: Symbolic regression aims to derive interpretable symbolic expressions from data in order to better understand and interpret data.
Machine Learning, Artificial Intelligence
### What problems does this paper attempt to solve?

This paper addresses several key problems in symbolic regression in order to improve both the interpretability and the accuracy of the resulting model. It proposes a new symbolic neural network, PruneSymNet, together with a corresponding pruning algorithm, and targets the following problems:

1. **Interpretability of symbolic expressions**: The goal of symbolic regression is to derive interpretable symbolic expressions from data so that the data can be better understood. Existing deep neural networks fit data well but are uninterpretable black-box models. The aim is therefore to simplify the network structure until it can be described by a simple mathematical expression, which yields an interpretable regression model.
2. **Simplification of complex expressions**: Traditional sparse optimization methods struggle to produce sufficiently simple expressions. For example, the EQL network still contains a large number of non-zero parameters after sparse optimization, so its expressions are overly complex and hard to interpret. To solve this, the paper proposes a greedy pruning algorithm that gradually removes redundant parts of the network until a concise sub-network, and hence a simpler expression, remains.
3. **Training stability**: Using elementary functions and operators (especially the division operator) as activation functions tends to cause gradient explosion and unstable training. The paper introduces an improved gradient descent method so that the network trains stably even when every layer contains a division operator.
4. **Local optima**: Greedy pruning can get stuck in local optima. To alleviate this, the paper combines pruning with beam search: at each pruning step several candidate expressions are retained, and the one with the smallest error is finally selected as the result. Randomness is also introduced to widen the exploration range and reduce the risk of settling on a local optimum.
5. **Coefficient post-processing**: The coefficients of the pruned expression may not be accurate enough, and the expression may still contain redundant terms. The paper proposes a post-processing algorithm that optimizes the coefficients and removes redundant terms, making the final expression more concise and accurate.

In summary, the main goal of the paper is to convert a complex neural network into a simple, interpretable symbolic expression, through the design of PruneSymNet and its pruning algorithm, while maintaining high fitting accuracy.
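The interplay of pruning, beam search, and coefficient refitting can be illustrated with a toy sketch. This is not the PruneSymNet implementation: it assumes the "network" is just a linear combination of a small, hypothetical library of candidate terms (`TERMS`), replaces gradient training with a least-squares refit, and omits the random exploration step. The names `refit` and `beam_prune` and the synthetic target are likewise invented for illustration.

```python
import numpy as np

# Hypothetical candidate terms; a stand-in for the sub-expressions of a symbolic network.
TERMS = {
    "1": lambda x: np.ones_like(x),
    "x": lambda x: x,
    "x^2": lambda x: x ** 2,
    "sin(x)": np.sin,
    "cos(x)": np.cos,
    "exp(x)": np.exp,
}


def refit(features, y, active):
    """Least-squares refit of the active columns; returns (mse, coefficients)."""
    cols = sorted(active)
    A = features[:, cols]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    mse = float(np.mean((A @ coef - y) ** 2))
    return mse, coef


def beam_prune(features, y, target_terms, beam_width=3):
    """Drop one term per round, keeping the beam_width lowest-error subsets.

    Assumes target_terms >= 1.
    """
    beam = [frozenset(range(features.shape[1]))]
    while len(beam[0]) > target_terms:
        # Expand: every way of removing one more term from each kept subset.
        candidates = {active - {j} for active in beam for j in active}
        # Select: keep the best few subsets rather than a single greedy choice.
        ranked = sorted(candidates, key=lambda s: refit(features, y, s)[0])
        beam = ranked[:beam_width]
    best = beam[0]
    # Final refit plays the role of coefficient post-processing in this toy setting.
    mse, coef = refit(features, y, best)
    return sorted(best), coef, mse


# Synthetic target chosen for illustration only.
x = np.linspace(-2.0, 2.0, 200)
y = 2.0 * x ** 2 + 0.5 * np.sin(x)

names = list(TERMS)
features = np.column_stack([f(x) for f in TERMS.values()])

kept, coef, mse = beam_prune(features, y, target_terms=2)
expression = " + ".join(f"{c:.3g}*{names[j]}" for j, c in zip(kept, coef))
print(expression, mse)
```

Setting `beam_width=1` reduces this to plain greedy pruning, which makes it easy to compare the two strategies on harder targets where a single greedy choice discards a term that is needed later.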