Enhancing Symbolic Regression and Universal Physics-Informed Neural Networks with Dimensional Analysis

Lena Podina, Diba Darooneh, Joshveer Grewal, Mohammad Kohandel
2024-11-25
Abstract: We present a new method for enhancing symbolic regression for differential equations via dimensional analysis, specifically Ipsen's and Buckingham pi methods. Since symbolic regression often suffers from high computational costs and overfitting, non-dimensionalizing datasets reduces the number of input variables, simplifies the search space, and ensures that derived equations are physically meaningful. As our main contribution, we integrate Ipsen's method of dimensional analysis with Universal Physics-Informed Neural Networks. We also combine dimensional analysis with the AI Feynman symbolic regression algorithm to show that dimensional analysis significantly improves the accuracy of the recovered equation. The results demonstrate that transforming data into a dimensionless form significantly decreases computation time and improves accuracy of the recovered hidden term. For algebraic equations, using the Buckingham pi theorem reduced complexity, allowing the AI Feynman model to converge faster with fewer data points and lower error rates. For differential equations, Ipsen's method was combined with Universal Physics-Informed Neural Networks (UPINNs) to identify hidden terms more effectively. These findings suggest that integrating dimensional analysis with symbolic regression can significantly lower computational costs, enhance model interpretability, and increase accuracy, providing a robust framework for automated discovery of governing equations in complex systems when data is limited.
Machine Learning
What problem does this paper attempt to address?
This paper addresses how dimensional analysis can improve the efficiency and accuracy of symbolic regression and Universal Physics-Informed Neural Networks (UPINNs) when applied to differential equations. Specifically, it aims to:

1. **Reduce computational cost and overfitting**: Symbolic regression typically suffers from high computational cost and overfitting. Non-dimensionalizing the dataset reduces the number of input variables, simplifies the search space, and ensures that the derived equations are physically meaningful.
2. **Improve the accuracy and efficiency of symbolic regression**: Using Ipsen's method and the Buckingham π theorem, the paper shows that the accuracy of symbolic regression algorithms (such as AI Feynman) in recovering hidden terms improves significantly. For algebraic equations in particular, the Buckingham π theorem reduces complexity, so the model converges faster and with a lower error rate.
3. **Combine dimensional analysis with UPINNs to identify hidden terms in differential equations**: For differential equations, the paper combines Ipsen's method with UPINNs to identify hidden terms more effectively. This reduces the number of input variables, keeps the resulting expressions dimensionally consistent, and avoids computations on values of very large or very small magnitude.
4. **Provide a robust framework for automated discovery of governing equations in complex systems**: When data is limited, the approach lowers computational cost, enhances model interpretability, and improves accuracy.
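The variable reduction promised by the Buckingham π theorem can be sketched as a null-space computation on the dimension matrix. The snippet below is a minimal illustration and not the paper's code: the variable set (`U`, `G`, `m1`, `m2`, `r1`, `r2`, a gravitational-energy problem) and the helper names `nullspace`, `pi_groups`, and `net_dimension` are assumptions made for this example. It uses only the standard library.

```python
from fractions import Fraction

# Dimension exponents (M, L, T) per variable. This variable set is an
# assumption chosen for illustration, not taken from the paper's datasets.
DIMENSIONS = {
    "U": (1, 2, -2),    # energy
    "G": (-1, 3, -2),   # gravitational constant
    "m1": (1, 0, 0),    # mass
    "m2": (1, 0, 0),    # mass
    "r1": (0, 1, 0),    # length
    "r2": (0, 1, 0),    # length
}


def nullspace(matrix):
    """Return a basis of the rational null space of `matrix` (a list of rows)."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    # Gauss-Jordan elimination to reduced row echelon form.
    for c in range(n_cols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    # One basis vector per free column: set it to 1 and solve for the pivots.
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_idx, pivot_col in enumerate(pivots):
            vec[pivot_col] = -rows[row_idx][free]
        basis.append(vec)
    return basis


def pi_groups(dimensions):
    """Return exponent dictionaries, one per independent dimensionless group."""
    names = list(dimensions)
    n_dims = len(next(iter(dimensions.values())))
    # Rows are base dimensions, columns are variables.
    matrix = [[dimensions[name][d] for name in names] for d in range(n_dims)]
    return [dict(zip(names, vec)) for vec in nullspace(matrix)]


def net_dimension(exponents, dimensions):
    """Net (M, L, T) exponents of a product of variables raised to `exponents`."""
    n_dims = len(next(iter(dimensions.values())))
    return tuple(
        sum(Fraction(e) * dimensions[name][d] for name, e in exponents.items())
        for d in range(n_dims)
    )


groups = pi_groups(DIMENSIONS)
for group in groups:
    # Show only the variables that actually appear in each group.
    print({name: str(e) for name, e in group.items() if e != 0})
```

With these six assumed variables and three base dimensions, the dimension matrix has rank 3, so the sketch returns three dimensionless groups. A regression model then searches over three inputs instead of six, which is the reduction in search space described above.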
### Main contributions of the paper

- **Developed a new pipeline**: Combines Ipsen's method with UPINNs and symbolic regression so that the hidden term depends on the minimum number of variables, improving the efficiency and effectiveness of both UPINNs and symbolic regression.
- **Demonstrated the effectiveness of dimensional analysis**: Experiments show that dimensional analysis improves symbolic regression by reducing the number of data points required, shortening the time needed to identify the correct expression, and improving the accuracy of the fitted equation.
- **Demonstrated effectiveness for different types of equations**: The method improves symbolic regression results for both algebraic and differential equations.

### Formulas from the paper

Two representative formulas:

- Application of the Buckingham π theorem, with a check that the group is dimensionless:

\[ \pi_1 = \frac{U r_2}{G m_1^2}, \qquad [\pi_1] = \frac{[ML^2T^{-2}]\cdot[L]}{[M^{-1}L^3T^{-2}]\cdot[M]^2} = 1 \]

- Nondimensionalized differential equation:

\[ \frac{d\alpha}{d\tau} = \beta\alpha\left(1 - \frac{\alpha}{\epsilon}\right) + G_2(\alpha) \]
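A short worked derivation may help show how a dimensionless equation of the form given above can arise. The summary does not state the dimensional equation, so everything on the dimensional side here is an assumption made for illustration: a logistic model with state \(N\), time \(t\), growth rate \(r\), carrying capacity \(K\), hidden term \(G(N)\), and reference scales \(N_0\) and \(t_0\). The derivation uses a plain change of variables and is not claimed to reproduce the paper's Ipsen procedure step by step.

\[ \frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right) + G(N), \qquad \alpha = \frac{N}{N_0}, \quad \tau = \frac{t}{t_0} \]

\[ \frac{d\alpha}{d\tau} = \frac{t_0}{N_0}\,\frac{dN}{dt} = r t_0\,\alpha\left(1 - \frac{N_0\,\alpha}{K}\right) + \frac{t_0}{N_0}\,G(N_0\,\alpha) \]

\[ \beta = r t_0, \quad \epsilon = \frac{K}{N_0}, \quad G_2(\alpha) = \frac{t_0}{N_0}\,G(N_0\,\alpha) \quad\Longrightarrow\quad \frac{d\alpha}{d\tau} = \beta\alpha\left(1 - \frac{\alpha}{\epsilon}\right) + G_2(\alpha) \]

Under these assumptions the hidden term \(G_2\) depends only on the dimensionless state \(\alpha\), which is the kind of reduced input that a UPINN and a subsequent symbolic regression step would have to fit.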