Abstract: Choosing the most adequate kernel is crucial in many Machine Learning applications. Gaussian Process is a state-of-the-art technique for regression and classification that heavily relies on a kernel function. However, in the Gaussian Process literature, kernels have usually been either ad hoc designed, selected from a predefined set, or searched for in a space of compositions of kernels which have been defined a priori. In this paper, we propose a Genetic-Programming algorithm that represents a kernel function as a tree of elementary mathematical expressions. By means of this representation, a wider set of kernels can be modeled, where potentially better solutions can be found, although new challenges also arise. The proposed algorithm is able to overcome these difficulties and find kernels that accurately model the characteristics of the data. This method has been tested in several real-world time-series extrapolation problems, improving the state-of-the-art results while reducing the complexity of the kernels.
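To make the tree representation concrete, here is a minimal sketch of a kernel written as a tree of elementary operations. The nested-tuple encoding and the names `OPS`, `evaluate`, `gram`, and `is_psd` are illustrative assumptions, not the paper's implementation. The sketch also shows one of the new challenges: a syntactically valid tree need not be a valid covariance function, so its Gram matrix is checked for symmetry and positive semi-definiteness.

```python
import numpy as np

# Hypothetical node encoding (an assumption, not taken from the paper):
#   ("x1",) / ("x2",)   -> the two scalar inputs of the kernel
#   ("const", c)        -> a constant (a hyperparameter in practice)
#   (op, child, ...)    -> an elementary operation applied to subtrees
OPS = {
    "add": np.add,
    "sub": np.subtract,
    "mul": np.multiply,
    "neg": np.negative,
    "square": np.square,
    "exp": np.exp,
}

def evaluate(tree, x1, x2):
    """Recursively evaluate an expression tree on broadcastable inputs."""
    head = tree[0]
    if head == "x1":
        return x1
    if head == "x2":
        return x2
    if head == "const":
        return np.asarray(tree[1], dtype=float)
    args = [evaluate(child, x1, x2) for child in tree[1:]]
    return OPS[head](*args)

def gram(tree, x):
    """Gram matrix K[i, j] = k(x[i], x[j]) for a 1-D input array x."""
    values = evaluate(tree, x[:, None], x[None, :])
    # Multiply by ones so that constant-only trees still yield a full matrix.
    return values * np.ones((x.size, x.size))

def is_psd(K, tol=1e-8):
    """True if K is symmetric with eigenvalues >= -tol."""
    if not np.allclose(K, K.T):
        return False
    return bool(np.linalg.eigvalsh(K).min() >= -tol)

# exp(-(x1 - x2)^2): a squared-exponential kernel built from elementary nodes.
se_tree = ("exp", ("neg", ("square", ("sub", ("x1",), ("x2",)))))
# x1 - x2: a well-formed tree that is not a valid covariance function.
bad_tree = ("sub", ("x1",), ("x2",))

x = np.linspace(0.0, 2.0, 6)
K_se = gram(se_tree, x)
K_bad = gram(bad_tree, x)
print("squared-exponential tree valid:", is_psd(K_se))
print("difference tree valid:", is_psd(K_bad))
```

An eigenvalue check on one finite sample can only reject a candidate; passing it does not prove that the expression is positive semi-definite for all inputs.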

What problem does this paper attempt to address?

### Problems Addressed by the Paper

The paper addresses kernel function selection in Gaussian Processes. Specifically, the authors propose a method based on Genetic Programming that evolves suitable kernel functions from basic mathematical expressions. The main objectives of this method are:

1. **Expanding the Search Space of Kernel Functions**: Traditional methods for selecting kernel functions rely on predefined sets of kernels or generate new kernels by combining known ones. This limits the kinds of kernel that can be found and may lead to suboptimal solutions. The proposed method uses basic mathematical expressions as building blocks, which allows a wider range of kernel functions to be generated.
2. **Improving Model Accuracy**: With a larger search space, the method can find kernel functions that describe the characteristics of the data more accurately, improving the performance of Gaussian Processes in regression and classification tasks.
3. **Reducing Kernel Function Complexity**: Although the search space admits more complex kernel functions, the method keeps complexity in check by limiting tree depth, which makes the resulting kernels more practical to use.
4. **Automating Kernel Function Selection**: An automated search replaces the manual design and selection of kernel functions, making Gaussian Processes more convenient to apply.

### Main Contributions

- **Proposed a New Genetic Programming Algorithm (EvoCov)**: The algorithm evolves suitable kernel functions from basic mathematical expressions and overcomes the difficulty that generated expressions may not be positive definite, and therefore not valid kernel functions.
- **Extended the Representational Capacity of Kernel Functions**: By using basic mathematical expressions, the method can generate a wider range of kernel functions, enhancing the representational capacity of Gaussian Processes.
- **Validation on Multiple Real-World Problems**: The method was tested on several time-series extrapolation problems, where it outperformed existing methods while producing kernel functions of lower complexity.

### Method Overview

- **Kernel Function Representation**: Kernel functions are represented as trees of basic mathematical expressions. Random kernel functions are generated recursively.
- **Variation Operators**: New kernel functions are produced by crossover and mutation. Crossover combines subtrees of two kernel functions, while mutation modifies a single kernel function through insertion, shrinkage, uniform replacement, or node replacement.
- **Depth Control**: A maximum depth limit prevents the generation of overly complex kernel functions.
- **Evaluation Method**: The Bayesian Information Criterion (BIC) serves as the fitness function. The hyperparameters of each kernel function are optimized before it is scored.

### Conclusion

The proposed method performs well on multiple real-world problems, improving the accuracy of Gaussian Processes while reducing the complexity of the kernel functions. It provides a new approach to automated kernel function selection.
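The evaluation step in the Method Overview can be sketched as follows. This is a minimal illustration under stated assumptions, not the EvoCov implementation: it assumes a zero-mean Gaussian Process with a fixed noise variance, uses the standard definition BIC = -2 log L + p log n (lower is better), and replaces the hyperparameter optimization with a plain grid search over a single length-scale. The names `log_marginal_likelihood`, `bic`, `fitness`, `se_kernel`, and `periodic_kernel` are hypothetical, and the data are synthetic.

```python
import numpy as np

def log_marginal_likelihood(K, y, noise=1e-2):
    """Zero-mean GP log marginal likelihood for Gram matrix K and targets y."""
    n = y.size
    # Cholesky raises LinAlgError when K + noise * I is not positive definite.
    L = np.linalg.cholesky(K + noise * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * n * np.log(2.0 * np.pi))

def bic(log_lik, n_params, n_points):
    """Standard Bayesian Information Criterion; lower is better."""
    return -2.0 * log_lik + n_params * np.log(n_points)

def se_kernel(x, lengthscale):
    """Squared-exponential candidate kernel."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def periodic_kernel(x, lengthscale, period=1.0):
    """Periodic candidate kernel; the period is held fixed in this sketch."""
    d = x[:, None] - x[None, :]
    return np.exp(-2.0 * np.sin(np.pi * d / period) ** 2 / lengthscale ** 2)

def fitness(kernel, x, y, grid):
    """Lowest BIC over a grid of length-scales (grid search is a stand-in
    for the optimizer); returns inf if no setting gives a valid matrix."""
    best = np.inf
    for lengthscale in grid:
        try:
            ll = log_marginal_likelihood(kernel(x, lengthscale), y)
        except np.linalg.LinAlgError:
            continue  # candidate is not a valid covariance at this setting
        best = min(best, bic(ll, n_params=1, n_points=y.size))
    return best

# Synthetic periodic data, used only to exercise the functions.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 40)
y = np.sin(2.0 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

grid = np.logspace(-1, 1, 20)
scores = {
    "squared_exponential": fitness(se_kernel, x, y, grid),
    "periodic": fitness(periodic_kernel, x, y, grid),
}
for name, score in scores.items():
    print(f"{name}: BIC = {score:.2f}")
```

Returning infinity for candidates whose covariance matrix cannot be factorized is one simple way to let an evolutionary search discard invalid kernels. The penalty term `n_params * log(n_points)` is what makes BIC favor kernels with fewer hyperparameters when their likelihoods are similar.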

Evolving Gaussian Process kernels from elementary mathematical expressions
