DoE2Vec: Deep-learning Based Features for Exploratory Landscape Analysis

Bas van Stein, Fu Xing Long, Moritz Frenzel, Peter Krause, Markus Gitterle, Thomas Bäck
2023-03-31
Abstract: We propose DoE2Vec, a variational autoencoder (VAE)-based methodology to learn optimization landscape characteristics for downstream meta-learning tasks, e.g., automated selection of optimization algorithms. Principally, using large training data sets generated with a random function generator, DoE2Vec self-learns an informative latent representation for any design of experiments (DoE). Unlike the classical exploratory landscape analysis (ELA) method, our approach does not require any feature engineering and is easily applicable for high dimensional search spaces. For validation, we inspect the quality of latent reconstructions and analyze the latent representations using different experiments. The latent representations not only show promising potentials in identifying similar (cheap-to-evaluate) surrogate functions, but also can significantly boost performances when being used complementary to the classical ELA features in classification tasks.
Optimization and Control, Artificial Intelligence, Machine Learning, Neural and Evolutionary Computing
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

The paper addresses the problem of effectively identifying and characterizing the landscape features of optimization problems in the context of the Algorithm Selection Problem (ASP). Specifically, the authors propose DoE2Vec, a method based on a variational autoencoder (VAE) that learns a latent representation of the design of experiments (DoE) of an optimization problem, thereby supporting downstream meta-learning tasks such as automated selection of optimization algorithms.

### Main Problem Background

1. **Complexity of Black-box Optimization Problems**:
   - Real-world black-box optimization problems are often very complex, especially when they are highly nonlinear and require expensive function evaluations.
   - According to the "No Free Lunch Theorem," no single optimization algorithm is best for all types of problems.
2. **Algorithm Selection Problem (ASP)**:
   - Selecting the most efficient optimization algorithm in terms of time and resources is a tedious and challenging task, even with domain knowledge and experience.
3. **Limitations of Existing Exploratory Landscape Analysis (ELA) Methods**:
   - Although ELA methods can capture the landscape features of optimization problems, they have the following issues:
     - Many ELA features are highly correlated and redundant.
     - Some ELA features lack expressive power in distinguishing problem instances.
     - Since ELA features are manually designed by experts, their feature computation may be biased.
     - ELA features have weak discriminative power in high-dimensional problems.

### Proposed Method

1. **DoE2Vec Method**:
   - Uses a variational autoencoder (VAE) to extract an informative low-dimensional latent representation from the design of experiments (DoE).
   - The method does not require any feature engineering and is suitable for high-dimensional search spaces.
   - The model is trained on a large number of randomly generated functions, enabling it to self-learn the latent representation of optimization problems.
2. **Validation and Application**:
   - The method is validated by examining the quality of latent reconstructions and by analyzing the latent representations in different experiments.
   - The latent representation shows potential not only in identifying similar (cheap-to-evaluate) surrogate functions but also in significantly improving the performance of classification tasks when combined with classical ELA features.

### Experimental Results

1. **Reconstruction of Function Landscapes**:
   - The model's loss was studied while varying the latent space size and the KL loss weight.
   - Results indicate that a larger latent space size and a smaller KL loss weight can improve the model's performance.
2. **Latent Space Representation**:
   - Projecting the latent spaces of an AE and a VAE into 2D visualizations showed that the VAE's latent space representation is more compact, with greater output variance.
3. **Identification of Similar Representative Functions**:
   - The DoE2Vec model can identify cheap-to-evaluate functions whose latent space representations are similar to that of a given DoE, which is useful for handling expensive real-world black-box optimization problems.
4. **Classification Tasks**:
   - A random forest model was used to classify the high-level properties of BBOB functions; results show that classification performance improves significantly when DoE2Vec is combined with classical ELA features.

### Conclusion and Future Work

1. **Conclusion**:
   - The DoE2Vec method can accurately reconstruct a large number of functions and can be used for downstream meta-learning tasks such as algorithm selection.
   - The method can be combined with existing techniques (such as classical ELA features) to further improve the classification accuracy of certain downstream tasks.
   - DoE2Vec has advantages in learning the latent representation of optimization problems, such as not requiring feature engineering or feature selection knowledge, not needing expertise in the ELA domain, and being suitable for optimization tasks.
2. **Future Work**:
   - Improve the method to address known limitations, such as scale.
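
Two ideas from the summary above can be made concrete with a small sketch: the trade-off between reconstruction quality and the KL loss weight, and the lookup of a similar cheap-to-evaluate function through its latent representation. This is a minimal illustration under stated assumptions, not the authors' implementation. The mean-squared reconstruction term, the standard-normal prior, the Euclidean distance, and the names `vae_loss`, `nearest_function`, and `kl_weight` are all assumptions made here, and the latent vectors in the usage lines are made-up values.

```python
import math


def vae_loss(y_true, y_recon, mu, log_var, kl_weight=1.0):
    """Return (total, reconstruction, kl) for a single sample.

    Assumed form (not taken from the paper): mean squared reconstruction
    error plus kl_weight times the KL divergence between the encoder's
    diagonal Gaussian N(mu, exp(log_var)) and a standard normal prior.
    """
    if len(y_true) != len(y_recon) or len(mu) != len(log_var):
        raise ValueError("length mismatch between inputs")
    # Reconstruction term: mean squared error over the DoE's objective values.
    recon = sum((a - b) ** 2 for a, b in zip(y_true, y_recon)) / len(y_true)
    # Closed-form KL divergence of a diagonal Gaussian from N(0, I).
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))
    return recon + kl_weight * kl, recon, kl


def nearest_function(query, library):
    """Return the name in `library` whose latent vector is closest to `query`.

    Euclidean distance is an assumption; any other metric could be swapped in.
    """
    return min(library, key=lambda name: math.dist(query, library[name]))


# Hypothetical usage with made-up numbers.
total, recon, kl = vae_loss(
    y_true=[0.0, 0.5, 1.0],
    y_recon=[0.1, 0.5, 0.9],
    mu=[1.0, 0.0],
    log_var=[0.0, 0.0],
    kl_weight=0.01,
)

# Made-up latent vectors for two cheap-to-evaluate candidate functions.
library = {"candidate_a": [0.0, 0.0], "candidate_b": [1.0, 1.0]}
print(nearest_function([0.1, 0.2], library))  # candidate_a
```

In this form, lowering `kl_weight` lets the reconstruction error dominate the total, which is consistent with the reported observation that a smaller KL loss weight improves reconstruction.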