Abstract: In this work, we examine the approximation capabilities of deep neural networks utilizing the Rectified Quadratic Unit (ReQU) activation function, defined as \(\max(0,x)^2\), for approximating Hölder-regular functions with respect to the uniform norm. We constructively prove that deep neural networks with ReQU activation can approximate any function within the \(R\)-ball of \(r\)-Hölder-regular functions (\(\mathcal{H}^{r, R}([-1,1]^d)\)) up to any accuracy \(\epsilon\) with at most \(\mathcal{O}\left(\epsilon^{-d/(2r)}\right)\) neurons and a fixed number of layers. This result highlights that the effectiveness of the approximation depends significantly on the smoothness of the target function and the characteristics of the ReQU activation function. Our proof is based on approximating local Taylor expansions with deep ReQU neural networks, demonstrating their ability to capture the behavior of Hölder-regular functions effectively. Furthermore, the results can be straightforwardly generalized to any Rectified Power Unit (RePU) activation function of the form \(\max(0,x)^p\) for \(p \geq 2\), indicating the broader applicability of our findings within this family of activations.

What problem does this paper attempt to address?

The core problem this paper addresses is how well deep neural networks with the Rectified Quadratic Unit (ReQU) activation function can uniformly approximate Hölder-regular functions. Specifically, the paper aims to prove that such networks can approximate any function in the \(R\)-ball of \(r\)-Hölder-regular functions \( \mathcal{H}^{r,R}([-1,1]^d) \) to arbitrary precision, and to give theoretical upper bounds on the required number of neurons and layers.

### Main problems and goals

1. **Approximation ability**: Study the uniform approximation ability of deep neural networks with the ReQU activation function on Hölder-regular functions.
2. **Theoretical bounds**: Show that at most \( \mathcal{O}(\epsilon^{-d/(2r)}) \) neurons and a fixed number of layers are required to achieve a given precision \( \epsilon \).
3. **Smoothness dependence**: Show that the effectiveness of the approximation depends significantly on the smoothness of the target function and the characteristics of the ReQU activation function.
4. **Generalized applicability**: The results can be generalized to any Rectified Power Unit (RePU) activation function of the form \( \max(0,x)^p \), where \( p \geq 2 \).

### Research methods

The paper shows by constructive proof how to approximate local Taylor expansions with a deep ReQU neural network, thereby capturing the behavior of Hölder-regular functions. It also analyzes how the depth, width, and total number of weights of the neural network relate to the approximation error.
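One reason ReQU networks are well suited to approximating Taylor expansions is that they can represent squares and products exactly. The following Python sketch illustrates this with a standard polarization identity; the function names are hypothetical, and this is an illustration of the general idea, not necessarily the exact construction used in the paper.

```python
def requ(x: float) -> float:
    """ReQU activation: max(0, x)^2."""
    return max(0.0, x) ** 2


def square(x: float) -> float:
    """x^2 written with two ReQU units: x^2 = requ(x) + requ(-x)."""
    return requ(x) + requ(-x)


def product(x: float, y: float) -> float:
    """x*y written with four ReQU units (one hidden layer).

    Uses (x+y)^2 - (x-y)^2 = 4xy, where each square is a sum
    of two ReQU units as in `square`.
    """
    return 0.25 * (requ(x + y) + requ(-x - y) - requ(x - y) - requ(y - x))


def monomial_x2y(x: float, y: float) -> float:
    """The monomial x^2 * y, built by composing the two blocks above."""
    return product(square(x), y)


if __name__ == "__main__":
    print(product(3.0, -2.0))      # product of 3 and -2
    print(monomial_x2y(2.0, 3.0))  # 2^2 * 3
```

Composing such blocks yields higher-degree monomials, so a polynomial such as a local Taylor expansion can be assembled from a modest number of ReQU units, with depth growing as products are nested.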
### Key formulas

- Definition of the ReQU activation function:

\[ \rho_2(x) = \max(0, x)^2 \]

- Approximation error bound:

\[ \|\Phi_f - f\|_{L^\infty([-1,1]^d)} \leq \epsilon \]

where \( \Phi_f \) is a ReQU neural network whose depth \( L(\Phi_f) \) and size \( N(\Phi_f) \) satisfy:

\[ L(\Phi_f) = \left\lfloor \log_2(\left\lfloor r \right\rfloor) \right\rfloor + 2 \left\lfloor \log_2\left(d+1+d\left\lfloor \log_2(\left\lfloor r \right\rfloor) \right\rfloor\right) \right\rfloor + 8 \]

\[ N(\Phi_f) = 2^d \left( \max \left( \left(1 + \binom{d+\left\lfloor r \right\rfloor}{d}\right) M^d \max(4, 2d+1) + 2,\; 2 \binom{d+\left\lfloor r \right\rfloor}{d} \left(d+1+d\left\lfloor \log_2(\left\lfloor r \right\rfloor) \right\rfloor\right) \right) + 2\left(M^d (2d+1) + 2d + 2dM^d\right) + 2 + M^d \max(4, 2d+1) \right) \]

### Conclusion

This research proves that ReQU neural networks can approximate Hölder-regular functions uniformly to any accuracy \( \epsilon \), and it provides explicit bounds on the depth and size of the networks involved. This extends the existing approximation theory of deep neural networks and lays a foundation for studying other activation functions and their performance in different function spaces.
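To get a feel for the depth formula \( L(\Phi_f) \) listed under the key formulas, the following Python sketch evaluates it for given \( d \) and \( r \). The function names are hypothetical, and the sketch assumes \( \lfloor r \rfloor \geq 1 \) so that the logarithm is defined. It also evaluates the binomial coefficient \( \binom{d+\lfloor r \rfloor}{d} \) that appears in \( N(\Phi_f) \), which equals the number of monomials of degree at most \( \lfloor r \rfloor \) in \( d \) variables. The full expression for \( N(\Phi_f) \) is not evaluated because it depends on \( M \), which the summary does not define.

```python
import math


def floor_log2(n: int) -> int:
    """Exact floor(log2(n)) for a positive integer n."""
    if n < 1:
        raise ValueError("floor_log2 requires n >= 1")
    return n.bit_length() - 1


def requ_depth(d: int, r: float) -> int:
    """Depth L(Phi_f) from the formula above; assumes floor(r) >= 1."""
    a = floor_log2(math.floor(r))
    return a + 2 * floor_log2(d + 1 + d * a) + 8


def num_taylor_monomials(d: int, r: float) -> int:
    """binom(d + floor(r), d): monomials of degree <= floor(r) in d variables."""
    return math.comb(d + math.floor(r), d)


if __name__ == "__main__":
    for d, r in [(1, 1.0), (2, 2.5), (3, 4.0)]:
        print(d, r, requ_depth(d, r), num_taylor_monomials(d, r))
```

The depth depends only on \( d \) and \( r \), not on \( \epsilon \), which is what the "fixed number of layers" in the abstract refers to.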

Uniform Approximation with Quadratic Neural Networks

Rates of Approximation by ReLU Shallow Neural Networks

Solving parametric partial differential equations with deep rectified quadratic unit neural networks

Neural networks with ReLU powers need less depth

Deep Neural Networks with ReLU-Sine-Exponential Activations Break Curse of Dimensionality in Approximation on Hölder Class.

Low dimensional approximation and generalization of multivariate functions on smooth manifolds using deep ReLU neural networks

Optimal Approximation Rates for Deep ReLU Neural Networks on Sobolev and Besov Spaces

Why Deep Neural Networks for Function Approximation?

Smooth Function Approximation by Deep Neural Networks with General Activation Functions

Approximation in $L^p(μ)$ with deep ReLU neural networks

Approximation of functions from Korobov spaces by shallow neural networks

Approximation Error and Complexity Bounds for ReLU Networks on Low-Regular Function Spaces

Implicit Hypersurface Approximation Capacity in Deep ReLU Networks

On the optimal approximation of Sobolev and Besov functions using deep ReLU neural networks

Approximation and interpolation of deep neural networks

Approximation Rates for Shallow ReLU$^k$ Neural Networks on Sobolev Spaces via the Radon Transform

Error bounds for approximations with deep ReLU neural networks in $W^{s,p}$ norms

Optimal rates of approximation by shallow ReLU$^k$ neural networks and applications to nonparametric regression

Differentiable Neural Networks with RePU Activation: with Applications to Score Estimation and Isotonic Regression

Optimal Rates of Approximation by Shallow ReLU Neural Networks and Applications to Nonparametric Regression

Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed Number of Neurons