Descriptor Aided Bayesian Optimization for Many-Level Qualitative Variables with Materials Design Applications

Akshay Iyer,Suraj Yerramilli,James M. Rondinelli,Daniel W. Apley,Wei Chen
DOI: https://doi.org/10.1115/1.4055848
2022-01-01
Journal of Mechanical Design
Abstract:Engineering design often involves qualitative and quantitative design variables, which requires systematic methods for the exploration of these mixed-variable design spaces. Expensive simulation techniques, such as those required to evaluate optimization objectives in materials design applications, constitute the main portion of the cost of the design process and underline the need for efficient search strategies-Bayesian optimization (BO) being one of the most widely adopted. Although recent developments in mixed-variable Bayesian optimization have shown promise, the effects of dimensionality of qualitative variables have not been well studied. High-dimensional qualitative variables, i.e., with many levels, impose a large design cost as they typically require a larger dataset to quantify the effect of each level on the optimization objective. We address this challenge by leveraging domain knowledge about underlying physical descriptors, which embody the physics of the underlying physical phenomena, to infer the effect of unobserved levels that have not been sampled yet. We show that physical descriptors can be intuitively embedded into the latent variable Gaussian process approach-a mixed-variable GP modeling technique-and used to selectively explore levels of qualitative variables in the Bayesian optimization framework. This physics-informed approach is particularly useful when one or more qualitative variables are high dimensional (many-level) and the modeling dataset is small, containing observations for only a subset of levels. Through a combination of mathematical test functions and materials design applications, our method is shown to be robust to certain types of incomplete domain knowledge and significantly reduces the design cost for problems with high-dimensional qualitative variables.
What problem does this paper attempt to address?