Bayesian Optimization with High-Dimensional Outputs

Wesley J. Maddox, Maximilian Balandat, Andrew Gordon Wilson, Eytan Bakshy
DOI: https://doi.org/10.48550/arXiv.2106.12997
2021-10-28
Abstract: Bayesian Optimization is a sample-efficient black-box optimization procedure that is typically applied to problems with a small number of independent objectives. However, in practice we often wish to optimize objectives defined over many correlated outcomes (or "tasks"). For example, scientists may want to optimize the coverage of a cell tower network across a dense grid of locations. Similarly, engineers may seek to balance the performance of a robot across dozens of different environments via constrained or robust optimization. However, the Gaussian Process (GP) models typically used as probabilistic surrogates for multi-task Bayesian Optimization scale poorly with the number of outcomes, greatly limiting applicability. We devise an efficient technique for exact multi-task GP sampling that combines exploiting Kronecker structure in the covariance matrices with Matheron's identity, allowing us to perform Bayesian Optimization using exact multi-task GP models with tens of thousands of correlated outputs. In doing so, we achieve substantial improvements in sample efficiency compared to existing approaches that only model aggregate functions of the outcomes. We demonstrate how this unlocks a new class of applications for Bayesian Optimization across a range of tasks in science and engineering, including optimizing interference patterns of an optical interferometer with more than 65,000 outputs.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses how to efficiently handle a large number of correlated outputs in multi-task Bayesian optimization (MTBO). The computational cost of the Gaussian Process (GP) models conventionally used as surrogates rises sharply with the number of tasks, which makes them impractical for problems with many correlated outputs. For example, optimizing wireless network coverage or robot performance requires considering many locations or environments simultaneously, which usually involves hundreds or thousands of correlated output variables. The paper proposes an efficient technique for exact multi-task GP sampling that combines Matheron's rule with Kronecker structure in the covariance matrices, enabling Bayesian optimization with tens of thousands of correlated outputs. The method improves sample efficiency and broadens the range of scientific and engineering problems to which Bayesian optimization applies, such as optimizing the interference pattern of an optical interferometer, a task with more than 65,000 outputs.

### Main Contributions

1. **Efficient Multi-task Gaussian Process Sampling Method**: An exact sampling method for multi-task GPs whose time cost is additive rather than multiplicative in the number of tasks and the number of data points.
2. **Application to Large-scale Multi-task Bayesian Optimization**: Experiments demonstrate the effectiveness of large-scale multi-task GP sampling on multi-objective, constrained, and contextual Bayesian optimization problems.
3. **Efficient Posterior Sampling of High-order Gaussian Process Models**: An efficient posterior sampling method for the High-Order Gaussian Process (HOGP) model, which makes the model usable for Bayesian optimization, especially with high-dimensional outputs such as images.

### Solutions

- **Utilizing Kronecker Structure and Matheron's Rule**: By combining Kronecker structure with Matheron's rule, the proposed method performs posterior sampling in multi-task GPs efficiently, reducing the complexity from \(O(n^3t^3)\) to \(O(n^3 + t^3)\), where \(n\) is the number of data points and \(t\) is the number of tasks.
- **Extension to High-order Gaussian Process Models**: The method is further extended to high-order GP models, enabling them to handle more complex multi-dimensional output tasks, such as optimizing the interference pattern of an optical interferometer.

### Experimental Results

- **Computational Efficiency**: Multi-task GP sampling with Matheron's rule is more efficient than conventional sampling directly from the posterior distribution when the number of tasks and data points is large, on both GPU and CPU.
- **Optimization Performance**: On multi-objective and large-scale constrained optimization problems, the multi-task GP model outperforms batch models. This holds especially for large numbers of tasks, a regime that traditional multi-task sampling methods cannot reach because of their computational cost.

In conclusion, by proposing an efficient multi-task GP sampling method, this paper removes the computational bottleneck of handling many correlated outputs in multi-task Bayesian optimization, thereby widening the range of practical problems to which Bayesian optimization can be applied.
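
The complexity reduction described under Solutions can be illustrated with a small NumPy sketch. It is not the authors' implementation; it only shows the general idea under stated assumptions: the multi-task covariance factorizes as a Kronecker product of an input kernel and a task covariance, observation noise is i.i.d. Gaussian, and the kernel (`rbf`), the toy data, and the function names `kron_solve` and `sample_posterior` are all hypothetical choices made for this example. Matheron's rule writes a posterior sample as a joint prior sample plus a data-dependent correction, and the correction needs only eigendecompositions of the \(n \times n\) and \(t \times t\) factors rather than of the full \(nt \times nt\) matrix.

```python
import numpy as np

def rbf(a, b, lengthscale=0.5):
    # Squared-exponential kernel between 1-D input arrays (assumed kernel choice).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def kron_solve(K_x, K_t, noise, R):
    """Solve (K_x kron K_t + noise * I) vec(S) = vec(R) for S of shape (n, t).

    Uses eigendecompositions of the two factors, costing O(n^3 + t^3),
    instead of factorizing the dense (n*t) x (n*t) matrix.
    vec() is row-major here, matching NumPy's default reshape/ravel.
    """
    lam_x, Q_x = np.linalg.eigh(K_x)
    lam_t, Q_t = np.linalg.eigh(K_t)
    W = Q_x.T @ R @ Q_t                       # rotate into the joint eigenbasis
    W = W / (np.outer(lam_x, lam_t) + noise)  # invert the diagonal system
    return Q_x @ W @ Q_t.T                    # rotate back

def sample_posterior(X, Y, X_test, K_t, noise, rng, jitter=1e-6):
    """Draw one posterior sample at X_test for all tasks via Matheron's rule."""
    n, m, t = len(X), len(X_test), K_t.shape[0]
    X_all = np.concatenate([X, X_test])
    K_all = rbf(X_all, X_all)
    # Joint prior draw over train and test inputs with covariance K_all kron K_t.
    # The small jitter stabilizes the Cholesky factor of the input kernel.
    L_x = np.linalg.cholesky(K_all + jitter * np.eye(n + m))
    L_t = np.linalg.cholesky(K_t)
    F = L_x @ rng.standard_normal((n + m, t)) @ L_t.T
    F_train, F_test = F[:n], F[n:]
    eps = np.sqrt(noise) * rng.standard_normal((n, t))  # simulated observation noise
    # Correction term: prior sample plus cross-covariance times solved residual.
    S = kron_solve(K_all[:n, :n], K_t, noise, Y - F_train - eps)
    return F_test + K_all[n:, :n] @ S @ K_t

# Toy usage with made-up data: 6 training inputs, 4 test inputs, 3 tasks.
rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 6)
X_test = np.linspace(0.05, 0.95, 4)
B = rng.standard_normal((3, 3))
K_t = B @ B.T + 0.1 * np.eye(3)               # positive-definite task covariance
Y = np.sin(2 * np.pi * X)[:, None] * np.array([1.0, 0.5, -1.0])
sample = sample_posterior(X, Y, X_test, K_t, noise=0.1, rng=rng)
print(sample.shape)
```

For small problems, `kron_solve` can be checked against a dense solve on `np.kron(K_x, K_t) + noise * np.eye(n * t)`. The dense route is the one that becomes infeasible as the number of tasks grows.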