Abstract: Generalization theory has been established for sparse deep neural networks in the high-dimensional regime. Beyond generalization, parameter estimation is also important, since it is crucial for variable selection and for the interpretability of deep neural networks. Current theoretical studies of parameter estimation focus mainly on two-layer neural networks, because the convergence of parameter estimation relies heavily on the regularity of the Hessian matrix, while the Hessian matrix of deep neural networks is highly singular. To avoid the unidentifiability of deep neural networks in parameter estimation, we propose to conduct nonparametric estimation of partial derivatives with respect to inputs. We first show that model convergence of sparse deep neural networks is guaranteed, in that the sample complexity grows only with the logarithm of the number of parameters or the input dimension when the $\ell_{1}$-norm of the parameters is well constrained. Then, by bounding the norm and the divergence of partial derivatives, we establish that the convergence rate of nonparametric estimation of partial derivatives scales as $\mathcal{O}(n^{-1/4})$, which is slower than the model convergence rate $\mathcal{O}(n^{-1/2})$. To the best of our knowledge, this study combines nonparametric estimation and parametric sparse deep neural networks for the first time. As nonparametric estimation of partial derivatives is of great significance for nonlinear variable selection, the current results point to a promising future for the interpretability of deep neural networks.
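
To make the idea of partial derivatives with respect to inputs concrete, the following is a minimal sketch, not the authors' estimator. It assumes a toy fully connected ReLU network with random weights; the names `relu_net` and `input_gradient`, the layer sizes, and the choice to zero out the first-layer weights of two inputs are all hypothetical illustrations. It shows that an input the network ignores has a zero partial derivative, which is the sense in which these derivatives can flag irrelevant variables.

```python
import numpy as np

def relu_net(x, weights, biases):
    """Forward pass of a fully connected ReLU network with scalar output."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)
    return (weights[-1] @ h + biases[-1]).item()

def input_gradient(x, weights, biases):
    """Partial derivatives of the network output with respect to each input."""
    h = x
    masks = []
    for W, b in zip(weights[:-1], biases[:-1]):
        z = W @ h + b
        masks.append((z > 0).astype(float))  # ReLU derivative at each unit
        h = np.maximum(z, 0.0)
    # Chain rule, applied from the output layer back to the input.
    g = weights[-1].copy()  # shape (1, width)
    for W, m in zip(reversed(weights[:-1]), reversed(masks)):
        g = (g * m) @ W
    return g.ravel()

# Hypothetical toy setup: 5 inputs, two hidden layers of width 8.
rng = np.random.default_rng(0)
d, width = 5, 8
W1 = rng.normal(size=(width, d))
W1[:, 3:] = 0.0  # assumption: inputs 3 and 4 are irrelevant (sparse first layer)
weights = [W1, rng.normal(size=(width, width)), rng.normal(size=(1, width))]
biases = [rng.normal(size=width), rng.normal(size=width), rng.normal(size=1)]

x = rng.normal(size=d)
grad = input_gradient(x, weights, biases)

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
fd = np.array([
    (relu_net(x + eps * e, weights, biases) - relu_net(x - eps * e, weights, biases)) / (2 * eps)
    for e in np.eye(d)
])

print("analytic gradient:", grad)
print("finite differences:", fd)
```

In this sketch the derivatives for the last two inputs are exactly zero because no path connects them to the output. With a network fitted to noisy data the derivatives would only be estimates, and the abstract's $\mathcal{O}(n^{-1/4})$ rate concerns how fast such estimates converge.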
