A Comparison of Recent Algorithms for Symbolic Regression to Genetic Programming

Yousef A. Radwan, Gabriel Kronberger, Stephan Winkler

2024-06-06

Abstract: Symbolic regression is a machine learning method whose goal is to produce interpretable results. Unlike other machine learning methods such as random forests or neural networks, which are opaque, symbolic regression aims to model and map data in a way that can be understood by scientists. Recent advancements have attempted to bridge the gap between these two fields; new methodologies attempt to fuse the mapping power of neural networks and deep learning techniques with the explanatory power of symbolic regression. In this paper, we examine these newly emerging systems and test the performance of an end-to-end transformer model for symbolic regression against the reigning traditional methods based on genetic programming that have spearheaded symbolic regression throughout the years. We compare these systems on novel datasets to avoid bias toward older methods that were improved on well-known benchmark datasets. Our results show that traditional GP methods, as implemented, e.g., by Operon, still remain superior to two recently published symbolic regression methods.

Machine Learning, Artificial Intelligence

What problem does this paper attempt to address?

### Problems the Paper Aims to Solve

This paper primarily explores the latest advancements in the field of Symbolic Regression (SR) and compares new deep learning-based methods with traditional genetic programming methods. Specifically:

1. **Understanding New Methods**: Researchers aim to better understand recently developed symbolic regression methods, particularly those based on deep learning, which are designed to generate interpretable models.
2. **Performance Comparison**: The paper experimentally compares the performance of the latest end-to-end Transformer models on symbolic regression tasks with traditional genetic programming-based methods. Novel datasets are used in the experiments to avoid biases towards existing benchmark datasets.
3. **Method Evaluation**: The authors find that although many new methods have shown impressive results in papers, there is a lack of systematic comparison with established algorithms. Therefore, they selected some datasets from practical engineering to evaluate whether the algorithms are overfitting to specific benchmark test sets.

In summary, the main goal of this paper is to evaluate and compare the latest symbolic regression methods, especially Transformer-based methods, with traditional genetic programming methods across various real-world datasets. By doing so, the paper aims to provide a more comprehensive understanding of how these new methods perform in practical applications and whether they truly outperform traditional symbolic regression methods.

Where are we now? A large benchmark study of recent symbolic regression methods

The Inefficiency of Genetic Programming for Symbolic Regression -- Extended Version

Symbolic Regression Algorithms with Built-in Linear Regression

Differentiable Genetic Programming for High-dimensional Symbolic Regression

Contemporary Symbolic Regression Methods and their Relative Performance

Racing Control Variable Genetic Programming for Symbolic Regression

Interpretable Symbolic Regression for Data Science: Analysis of the 2022 Competition

Symbolic Regression via Control Variable Genetic Programming

Evolvability Degeneration in Multi-Objective Genetic Programming for Symbolic Regression

Elite Bases Regression: A Real-time Algorithm for Symbolic Regression

SymbolicGPT: A Generative Transformer Model for Symbolic Regression

End-to-end symbolic regression with transformers

Deep Generative Symbolic Regression

Genetic Programming Based Symbolic Regression for Analytical Solutions to Differential Equations

An adaptive GP-based memetic algorithm for symbolic regression

A Novel Neural Network-Based Symbolic Regression Method: Neuro-Encoded Expression Programming

Constraining Genetic Symbolic Regression via Semantic Backpropagation

A Functional Analysis Approach to Symbolic Regression

Cluster Analysis of a Symbolic Regression Search Space

Bayesian Symbolic Regression