Spurious Local Minima of Deep ReLU Neural Networks in the Neural Tangent Kernel Regime

Tohru Nitta
DOI: https://doi.org/10.48550/arXiv.1806.04884
2022-05-19
Abstract: In this paper, we theoretically prove that deep ReLU neural networks do not lie in spurious local minima of the loss landscape under the Neural Tangent Kernel (NTK) regime, that is, in the gradient descent training dynamics of deep ReLU neural networks whose parameters are initialized from a normal distribution, in the limit as the widths of the hidden layers tend to infinity.
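For orientation, the NTK regime named in the abstract can be summarized with the standard textbook formulation below. This is a sketch under stated assumptions, not the paper's own derivation: the notation ($f$, $\theta_t$, $\Theta_t$, $\Theta_\infty$, training pairs $(x_i, y_i)$) and the choice of a squared loss trained by gradient flow are assumptions introduced here for illustration and do not come from the abstract.

```latex
% Assumed notation (not from the abstract): f(x;\theta) is the network output,
% \theta_t are the parameters at training time t, (x_i, y_i) are training pairs.

% Empirical neural tangent kernel at time t:
\Theta_t(x, x') = \nabla_\theta f(x;\theta_t)^{\top} \, \nabla_\theta f(x';\theta_t)

% Assumed loss: squared error over n training points.
L(\theta) = \frac{1}{2} \sum_{i=1}^{n} \bigl( f(x_i;\theta) - y_i \bigr)^2

% Gradient flow \dot{\theta}_t = -\nabla_\theta L(\theta_t) induces, by the chain rule,
\frac{d}{dt} f(x;\theta_t)
  = -\sum_{i=1}^{n} \Theta_t(x, x_i) \, \bigl( f(x_i;\theta_t) - y_i \bigr)

% In the infinite-width limit with normally distributed initialization,
% \Theta_t is replaced by a deterministic, time-independent kernel \Theta_\infty,
% so the outputs on the training set X evolve linearly:
f_t(X) - y = e^{-\Theta_\infty(X, X)\, t} \, \bigl( f_0(X) - y \bigr)
```

Under these assumptions, if $\Theta_\infty(X, X)$ is positive definite, the training residual decays to zero, which gives some intuition for why spurious local minima would not trap the dynamics in this regime.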
Machine Learning