Depth Separation with Multilayer Mean-Field Networks

Yunwei Ren, Mo Zhou, Rong Ge
2023-04-03
Abstract: Depth separation -- why a deeper network is more powerful than a shallower one -- has been a major problem in deep learning theory. Previous results often focus on representation power. For example, arXiv:1904.06984 (https://arxiv.org/abs/1904.06984) constructed a function that is easy to approximate using a 3-layer network but not approximable by any 2-layer network. In this paper, we show that this separation is in fact algorithmic: one can efficiently learn the function constructed by arXiv:1904.06984 using an overparameterized network with polynomially many neurons. Our result relies on a new way of extending the mean-field limit to multilayer networks, and a decomposition of the loss that factors out the error introduced by the discretization of infinite-width mean-field networks.
Machine Learning, Optimization and Control