Neil K. Chada, Philip J. Herbert
Abstract: Stochastic gradient methods have been a popular and powerful choice of optimization methods, aimed at minimizing functions. Their advantage lies in the fact that one approximates the gradient as opposed to using the full Jacobian matrix. One research direction, related to this, has been on the application to infinite-dimensional problems, where one may naturally have a Hilbert space framework. However, there has been limited work done on considering this in a more general setup, such as where the natural framework is that of a Banach space. This article aims to address this by the introduction of a novel stochastic method, the stochastic steepest descent method (SSD). The SSD will follow the spirit of stochastic gradient descent, which utilizes Riesz representation to identify gradients and derivatives. Our choice for using such a method is that it naturally allows one to adopt a Banach space setting, for which recent applications have exploited the benefit of this, such as in PDE-constrained shape optimization. We provide a convergence theory related to this under mild assumptions. Furthermore, we demonstrate the performance of this method on a couple of numerical applications, namely a $p$-Laplacian and an optimal control problem. Our assumptions are verified in these applications.
What problem does this paper attempt to address?
### The problems the paper attempts to solve
The paper addresses robust optimization in Banach spaces. It introduces a new stochastic method, the stochastic steepest descent method (SSD), for infinite-dimensional optimization problems. Stochastic gradient descent (SGD) has been widely studied in the Hilbert space setting, but little work covers problems whose natural framework is the more general Banach space setting. The paper aims to fill this gap with the SSD method.
### Specific description of the problem
The paper considers the minimization of a function \(J\) over a Banach space \(X\), where \(J(u)\) is the expected value of an integrand \(j(u, \xi)\):
\[J(u)=\mathbb{E}[j(u, \cdot)] = \int_{\Omega}j(u, \xi)dP(\xi)\]
The goal is to minimize \(J\), that is, the expected value of \(j(u, \cdot)\), by finding \(u^*\) such that:
\[u^*\in\arg\min\{\mathbb{E}[j(u, \cdot)]:u\in X\}\]
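As a minimal sketch of this objective, assume a finite-dimensional stand-in \(X = \mathbb{R}^d\) and a hypothetical integrand \(j(u, \xi) = \tfrac{1}{2}\|u - \xi\|_2^2\) with \(\xi \sim N(m, I)\). Neither choice comes from the paper, and the names `j` and `estimate_J` are illustrative. The expectation can then be approximated by a sample average:

```python
import numpy as np

def j(u, xi):
    """Hypothetical integrand (an assumption): j(u, xi) = 0.5 * ||u - xi||_2^2.

    `xi` may be a single sample of shape (d,) or a batch of shape (N, d).
    """
    return 0.5 * np.sum((u - xi) ** 2, axis=-1)

def estimate_J(u, samples):
    """Sample-average approximation of J(u) = E[j(u, .)]."""
    return float(np.mean(j(u, samples)))

rng = np.random.default_rng(0)
mean = np.array([1.0, -2.0])                      # assumed mean m of xi
samples = mean + rng.standard_normal((20000, 2))  # draws of xi ~ N(m, I)

J_at_mean = estimate_J(mean, samples)
J_at_zero = estimate_J(np.zeros(2), samples)
```

For this particular integrand \(J(u) = \tfrac{1}{2}\|u - m\|_2^2 + \tfrac{d}{2}\), so the two estimates should land near 1.0 and 3.5 respectively, and the minimizer is \(u^* = m\).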
### Why this is an important problem
1. **Robustness**: Minimizing the expected value \(\mathbb{E}[j(u, \cdot)]\) yields a result that is optimal on average, which is more robust than minimizing \(j(u, \xi)\) for a single fixed \(\xi\), as one would in a deterministic formulation.
2. **Practical applications**: Such problems arise in practical settings such as PDE-constrained shape optimization and optimal control.
3. **Theoretical challenges**: Optimization in a Banach space is technically harder than in a Hilbert space. SGD relies on the Riesz representation theorem to identify the derivative with a gradient in the space itself, and that identification is generally unavailable in a Banach space, so a descent direction has to be defined differently.
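The distinction between a gradient and a steepest descent direction can be written out with the standard definitions (these are textbook formulations, not quoted from the paper). In a Hilbert space \(H\), the Riesz representation theorem gives a unique gradient \(\nabla J(u) \in H\) with
\[\langle \nabla J(u), v\rangle_H = J'(u)[v] \quad \text{for all } v \in H.\]
In a Banach space \(X\), the derivative \(J'(u)\) lives in the dual \(X^*\), and a steepest descent direction is instead any
\[v_u \in \arg\min\{J'(u)[v] : v \in X,\ \|v\|_X \le 1\},\]
which, when such a minimizer exists (for example, when \(X\) is reflexive), satisfies \(J'(u)[v_u] = -\|J'(u)\|_{X^*}\).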
### Main contributions of the paper
1. **New method**: The paper introduces the stochastic steepest descent method (SSD), which follows the spirit of SGD but is formulated for Banach spaces, where the Riesz identification of derivatives with gradients that SGD relies on is not generally available.
2. **Convergence theory**: The paper provides a convergence theory for SSD. Under certain assumptions, the sequence \(J(u_n)\) generated by the algorithm converges, and \(\liminf_{n\rightarrow\infty}\|J'(u_n)\|_{X^*} = 0\).
3. **Numerical experiments**: The method is demonstrated on two numerical applications, a \(p\)-Laplacian problem and an optimal control problem, in which the assumptions of the theory are verified.
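To make the idea of a stochastic steepest descent iteration concrete, here is a hedged finite-dimensional sketch. It is not the paper's algorithm: it assumes \(X = \mathbb{R}^d\) with the \(\ell^p\) norm, the hypothetical integrand \(j(u, \xi) = \tfrac{1}{2}\|u - \xi\|_2^2\), mini-batch derivative estimates, step sizes \(t_n = 1/(n+1)\), and a step scaled by the dual norm of the derivative estimate. All function names are illustrative.

```python
import numpy as np

def steepest_descent_direction(g, p):
    """Return (v, ||g||_q) where v has unit l^p norm and minimizes <g, v>.

    Valid for 1 < p < inf, with q = p / (p - 1) the dual exponent.
    By Hoelder's inequality <g, v> >= -||g||_q, with equality for this v.
    """
    q = p / (p - 1.0)
    dual_norm = np.linalg.norm(g, ord=q)
    if dual_norm == 0.0:
        return np.zeros_like(g), 0.0
    v = -np.sign(g) * np.abs(g) ** (q - 1.0) / dual_norm ** (q - 1.0)
    return v, dual_norm

def ssd_sketch(u0, sample_xi, p, n_steps, batch_size, rng):
    """Illustrative stochastic steepest descent loop (assumptions in lead-in)."""
    u = np.array(u0, dtype=float)
    for n in range(n_steps):
        xi = sample_xi(rng, batch_size)
        # Mini-batch estimate of J'(u) for j(u, xi) = 0.5 * ||u - xi||_2^2.
        g = np.mean(u - xi, axis=0)
        v, dual_norm = steepest_descent_direction(g, p)
        t_n = 1.0 / (n + 1)              # assumed decaying step size
        u = u + t_n * dual_norm * v      # step along the steepest descent direction
    return u

target = np.array([1.0, -2.0])           # assumed mean of xi, minimizer of J

def sample_xi(rng, size):
    """Draw `size` samples of xi ~ N(target, I)."""
    return target + rng.standard_normal((size, target.size))

rng = np.random.default_rng(0)
u_final = ssd_sketch(np.zeros(2), sample_xi, p=4.0,
                     n_steps=500, batch_size=32, rng=rng)
```

With \(p = 2\) the direction reduces to the normalized negative gradient, so the loop becomes ordinary mini-batch SGD. Other values of \(p\) change the geometry of each step while still moving `u_final` toward `target` in this toy problem.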
### Conclusion
This paper proposes the stochastic steepest descent method (SSD) as a tool for robust optimization in Banach spaces. It supports the method with a convergence theory under mild assumptions and demonstrates its performance on two numerical examples, a \(p\)-Laplacian problem and an optimal control problem. Future research could explore the application of SSD to other practical problems, such as shape optimization and the solution of nonlinear partial differential equations.