Error bounds for deep ReLU networks using the Kolmogorov--Arnold superposition theorem

Hadrien Montanelli, Haizhao Yang
DOI: https://doi.org/10.48550/arXiv.1906.11945
2020-05-20
Abstract: We prove a theorem concerning the approximation of multivariate functions by deep ReLU networks, for which the curse of the dimensionality is lessened. Our theorem is based on a constructive proof of the Kolmogorov--Arnold superposition theorem, and on a subset of multivariate continuous functions whose outer superposition functions can be efficiently approximated by deep ReLU networks.
Numerical Analysis, Machine Learning
What problem does this paper attempt to address?
The core problem this paper addresses is: **how to approximate multivariate continuous functions with deep ReLU networks while lessening the curse of dimensionality**. Specifically, the authors ask how to avoid a cost that grows rapidly with the dimension when deep ReLU networks are used to approximate functions on high-dimensional domains. To this end, they develop a theoretical framework based on the Kolmogorov-Arnold superposition theorem.

### Main contributions of the paper:

1. **Theoretical proof**: For a certain subset of multivariate continuous functions, it is proved that deep ReLU networks can approximate them to accuracy \(\epsilon\) with depth and size \(O(\epsilon^{-\log n})\), rather than the traditional \(O(\epsilon^{-n})\). This shows that deep ReLU networks can lessen the curse of dimensionality for this class of functions.
2. **Constructive proof**: The argument builds on a constructive proof of the Kolmogorov-Arnold superposition theorem and shows how deep ReLU networks can approximate the inner and outer functions of the superposition.
3. **Error analysis**: A detailed error-bound analysis is provided for the deep ReLU networks that approximate these multivariate continuous functions.

### Specific problem descriptions:

- **Curse of dimensionality**: In high-dimensional spaces, the cost of traditional methods (such as shallow networks or polynomial approximation) grows exponentially with the dimension, which makes them infeasible in practice.
- **Approximation ability**: How to design a deep neural network that efficiently approximates multivariate continuous functions in high dimensions while keeping the network depth and size low.
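To make the contrast between \(\epsilon^{-\log n}\) and \(\epsilon^{-n}\) concrete, the sketch below evaluates both rates for a few dimensions. It assumes the natural logarithm and drops the constants hidden in the \(O\) notation; neither is specified in this summary, and the function names are illustrative only. It also uses the identity \(\epsilon^{-\log n} = n^{\log(1/\epsilon)}\), which shows that for a fixed accuracy the first rate is polynomial in \(n\).

```python
import math

def size_superposition(eps: float, n: int) -> float:
    """Rate eps^(-log n); natural log assumed, constants dropped."""
    return eps ** (-math.log(n))

def size_classical(eps: float, n: int) -> float:
    """Rate eps^(-n), the exponential-in-n rate, constants dropped."""
    return eps ** (-n)

if __name__ == "__main__":
    eps = 0.1  # illustrative target accuracy
    for n in (2, 4, 8, 16):
        s = size_superposition(eps, n)
        c = size_classical(eps, n)
        # eps^(-log n) equals n^(log(1/eps)): polynomial in n for fixed eps
        poly = n ** math.log(1.0 / eps)
        print(f"n={n:2d}  eps^(-log n)={s:10.3e}  n^(log 1/eps)={poly:10.3e}  eps^(-n)={c:10.3e}")
```

The numbers are only meaningful as growth trends: the actual network sizes in the paper depend on constants and on the function class, which this sketch does not model.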
Through this work, the authors provide theoretical support for understanding the approximation ability of deep ReLU networks, along with guidance on designing efficient deep-learning models in practical applications.

### Key formulas:

According to the Kolmogorov-Arnold superposition theorem, any continuous function \(f: [0,1]^n\rightarrow\mathbb{R}\) can be decomposed as:

\[f(x_1,\ldots,x_n)=\sum_{j = 0}^{2n}\varphi_j\left(\sum_{i = 1}^n\psi_{i,j}(x_i)\right)\]

where the outer functions \(\varphi_j\) and the inner functions \(\psi_{i,j}\) are continuous functions of a single variable. The paper further explores how to approximate these inner and outer functions with deep ReLU networks.
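Because the superposition reduces a multivariate function to sums and compositions of univariate functions, the basic building block is approximating a function of one variable with ReLU units. The sketch below is not the paper's deep construction: it builds a shallow (one-hidden-layer) ReLU network that equals the piecewise-linear interpolant of a function on equispaced knots, and adds a helper that evaluates the superposition formula for given inner and outer functions. The names `relu_interpolant` and `superposition`, the test function \(\sin(2\pi t)\), and the knot count are all illustrative assumptions.

```python
import math

def relu(t: float) -> float:
    return max(t, 0.0)

def relu_interpolant(g, m: int):
    """Shallow ReLU network equal on [0, 1] to the piecewise-linear
    interpolant of g at the m + 1 equispaced knots k / m."""
    knots = [k / m for k in range(m + 1)]
    vals = [g(t) for t in knots]
    # Slope of the interpolant on each interval [k/m, (k+1)/m].
    slopes = [(vals[k + 1] - vals[k]) * m for k in range(m)]
    # The weight on relu(t - knot_k) is the change of slope at knot_k.
    coeffs = [slopes[0]] + [slopes[k] - slopes[k - 1] for k in range(1, m)]

    def net(t: float) -> float:
        # zip stops at the last interior knot; the knot at 1 needs no unit.
        return vals[0] + sum(c * relu(t - x) for c, x in zip(coeffs, knots))

    return net

def superposition(x, outer, inner):
    """Evaluate sum_{j=0}^{2n} outer[j](sum_{i} inner[i][j](x[i])).
    outer has 2n + 1 entries; inner is n lists of 2n + 1 functions."""
    n = len(x)
    return sum(
        outer[j](sum(inner[i][j](x[i]) for i in range(n)))
        for j in range(2 * n + 1)
    )

# Illustrative univariate target (an assumption, not taken from the paper).
g = lambda t: math.sin(2 * math.pi * t)
net = relu_interpolant(g, 64)
max_err = max(abs(net(k / 1000) - g(k / 1000)) for k in range(1001))

if __name__ == "__main__":
    print(f"max error of the 64-interval ReLU interpolant: {max_err:.2e}")
```

For a twice-differentiable function the interpolation error is at most \(\max|g''|\,h^2/8\) with \(h = 1/m\), so halving the knot spacing reduces the error by about a factor of four. The inner and outer functions of the actual theorem are much less regular than this smooth example, which is why the paper restricts attention to a subset of functions whose outer functions can be approximated efficiently.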