Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks

Zhiwei Bai, Tao Luo, Zhi-Qin John Xu, Yaoyu Zhang
DOI: https://doi.org/10.4208/csiam-am.so-2023-0020
2024-01-01
Abstract: Understanding the relation between deep and shallow neural networks is extremely important for the theoretical study of deep learning. In this work, we discover an embedding principle in depth: the loss landscape of an NN "contains" all critical points of the loss landscapes of shallower NNs. The key tool for our discovery is the critical lifting operator proposed in this work, which maps any critical point of a network to critical manifolds of any deeper network while preserving the outputs. This principle provides new insights into many widely observed behaviors of DNNs. Regarding the easy training of deep networks, we show that a local minimum of an NN can be lifted to strict saddle points of a deeper NN. Regarding the acceleration effect of batch normalization, we demonstrate that batch normalization helps avoid the critical manifolds lifted from shallower NNs by suppressing layer linearization. We also prove that increasing training data shrinks the lifted critical manifolds, which can result in acceleration of training as demonstrated in experiments. Overall, our discovery of the embedding principle in depth uncovers the depth-wise hierarchical structure of the deep learning loss landscape, which serves as a solid foundation for further study of the role of depth in DNNs.