Abstract: Many engineering problems of industrial interest involve expensive black-box experiments or computer simulations. Gaining insight into such a problem, or optimizing it, can require hundreds of thousands of evaluations of the objective function, which is usually impractical. Gaussian Process (GP) surrogate modeling replaces the expensive function with a cheap-to-evaluate, data-driven probabilistic model. Although the GP does not assume a functional form for the problem, it is defined by a set of parameters called hyperparameters, which govern characteristics of the modeled function such as smoothness, magnitude, and periodicity. Accurately estimating these hyperparameters is key to developing a reliable and generalizable surrogate model. Markov chain Monte Carlo (MCMC) is a widely used Bayesian method for estimating them. At the GE Global Research Center, a customized, industry-strength Bayesian hybrid modeling framework built on the GP, called GEBHM, has been employed and validated over many years. GEBHM is very effective on small and medium-sized problems, typically those with fewer than 1000 training points. However, GP training time scales poorly with growing dataset size and problem dimensionality, which can be a major impediment for larger problems. In this work, we extend GEBHM with an Adaptive Sequential Monte Carlo (ASMC) methodology for training the GP, enabling the modeling of large-scale industry problems. The implementation reduces computational time, especially for large-scale problems, without sacrificing predictive accuracy relative to the current MCMC implementation. We demonstrate the effectiveness and accuracy of GEBHM with ASMC on four mathematical problems and on two challenging industry applications of varying complexity.
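
To make the role of the hyperparameters and the source of the scaling problem concrete, the following is a minimal sketch of a GP log marginal likelihood, the quantity that a Bayesian sampler such as MCMC or ASMC would evaluate repeatedly. It is not the GEBHM implementation: the squared-exponential kernel, the function names, and the parameters `length_scale`, `signal_var`, and `noise_var` are assumptions chosen for illustration only.

```python
import numpy as np

def sq_exp_kernel(xa, xb, length_scale, signal_var):
    """Squared-exponential covariance between two sets of 1-D inputs (assumed kernel)."""
    diff = xa[:, None] - xb[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)

def log_marginal_likelihood(x, y, length_scale, signal_var, noise_var):
    """GP log marginal likelihood for zero-mean data.

    The Cholesky factorization of the n-by-n covariance matrix costs O(n^3),
    which is why training time grows quickly with the number of points.
    """
    n = x.shape[0]
    K = sq_exp_kernel(x, x, length_scale, signal_var) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))  # equals -0.5 * log det K
            - 0.5 * n * np.log(2.0 * np.pi))

# Synthetic stand-in for an expensive black-box function.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2.0 * np.pi * x) + 0.05 * rng.standard_normal(x.shape[0])

# A length scale matched to the data versus one that is far too short.
lml_good = log_marginal_likelihood(x, y, length_scale=0.2, signal_var=1.0, noise_var=0.05**2)
lml_bad = log_marginal_likelihood(x, y, length_scale=1e-3, signal_var=1.0, noise_var=0.05**2)
print(lml_good > lml_bad)
```

Each likelihood evaluation requires one factorization of the full covariance matrix, so any sampler that needs many such evaluations inherits the cubic cost in the number of training points.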

Remarks for Scaling Up a General Gaussian Process to Model Large Dataset with Sub-models

Towards Scalable Gaussian Process Modeling

Active Design of Dynamic GP Models for Model Predictive Control Using Expected Improvement

Scalable Fully Bayesian Gaussian Process Modeling and Calibration With Adaptive Sequential Monte Carlo for Industrial Applications

Data-Driven Design via Scalable Gaussian Processes for Multi-Response Big Data With Qualitative Factors

Understanding and comparing scalable Gaussian process regression for big data

When Gaussian Process Meets Big Data: A Review of Scalable GPs

ProSpar-GP: scalable Gaussian process modeling with massive non-stationary datasets

Scalable Gaussian Processes for Data-Driven Design Using Big Data With Categorical Factors

Towards Efficient Modeling and Inference in Multi-Dimensional Gaussian Process State-Space Models

Gaussian Process Emulation For Big Data In Data-Driven Metamaterials Design

Scaling Gaussian Process Regression with Derivatives

A segregated genetic programming for bioprocess modelling with outliers

Scalable mixed-domain Gaussian process modeling and model reduction for longitudinal data

Additive Kernels for Gaussian Process Modeling

Representing Additive Gaussian Processes by Sparse Matrices

A Strategy for Adaptive Sampling of Multi-Fidelity Gaussian Processes to Reduce Predictive Uncertainty

H-GPR: A Hybrid Strategy for Large-Scale Gaussian Process Regression

Globally Approximate Gaussian Processes for Big Data with Application to Data-Driven Metamaterials Design (IDETC2019-98027)

A Global-Local Approximation Framework for Large-Scale Gaussian Process Modeling

Modulating Scalable Gaussian Processes for Expressive Statistical Learning