Training Behavior of Deep Neural Network in Frequency Domain.

Zhi-Qin John Xu, Yaoyu Zhang, Yanyang Xiao
DOI: https://doi.org/10.1007/978-3-030-36708-4_22
2019-01-01
Abstract: Why deep neural networks (DNNs) that are capable of overfitting often generalize well in practice remains a mystery [24]. To identify a potential mechanism, we study the implicit biases underlying the training process of DNNs. For both real and synthetic datasets, we empirically find that a DNN with common settings first quickly captures the dominant low-frequency components and then captures the high-frequency ones relatively slowly. We call this phenomenon the Frequency Principle (F-Principle). In our experiments, the F-Principle is observed across DNNs with various structures, activation functions, and training algorithms. We also illustrate how the F-Principle helps explain the effect of early stopping as well as the generalization of DNNs. The F-Principle potentially provides insight into a general principle underlying DNN optimization and generalization.
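One way to probe the F-Principle on a toy problem is to train a small network on a one-dimensional target with two known frequencies and track how quickly each frequency component of the network output approaches that of the target. The sketch below is an illustration only, not the authors' experimental setup: the target sin(x) + 0.5 sin(5x), the one-hidden-layer tanh network, the plain gradient descent, and all names and hyperparameters (relative_freq_error, train_and_track, width, lr) are assumptions made for this example.

```python
import numpy as np


def relative_freq_error(target, output, k):
    """Relative error of the k-th DFT coefficient of `output` against `target`."""
    t_hat = np.fft.rfft(target)[k]
    o_hat = np.fft.rfft(output)[k]
    return abs(o_hat - t_hat) / abs(t_hat)


def train_and_track(steps=2000, width=32, lr=0.005, record_every=100, seed=0):
    """Train a tiny tanh network with full-batch gradient descent.

    Returns an array with rows (step, error at k=1, error at k=5).
    All settings here are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    n = 128
    # One full period on [-pi, pi), so sin(x) and sin(5x) land on DFT bins 1 and 5.
    x = np.linspace(-np.pi, np.pi, n, endpoint=False).reshape(-1, 1)
    y = np.sin(x) + 0.5 * np.sin(5 * x)

    W1 = rng.normal(0.0, 1.0, (1, width))
    b1 = rng.normal(0.0, 1.0, (width,))
    W2 = rng.normal(0.0, 1.0 / np.sqrt(width), (width, 1))
    b2 = np.zeros(1)

    history = []
    for step in range(steps + 1):
        # Forward pass.
        h = np.tanh(x @ W1 + b1)
        out = h @ W2 + b2

        if step % record_every == 0:
            low = relative_freq_error(y[:, 0], out[:, 0], 1)
            high = relative_freq_error(y[:, 0], out[:, 0], 5)
            history.append((step, low, high))

        # Backward pass for the mean squared error loss.
        g_out = 2.0 * (out - y) / n
        gW2 = h.T @ g_out
        gb2 = g_out.sum(axis=0)
        g_h = (g_out @ W2.T) * (1.0 - h ** 2)
        gW1 = x.T @ g_h
        gb1 = g_h.sum(axis=0)

        # Plain gradient descent update.
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2

    return np.array(history)


if __name__ == "__main__":
    for step, low, high in train_and_track():
        print(f"step {int(step):5d}  low-freq error {low:.3f}  high-freq error {high:.3f}")
```

If the F-Principle holds in this setting, the low-frequency column should fall toward zero earlier than the high-frequency column. That outcome depends on the seed and hyperparameters chosen here and is not guaranteed by the sketch itself.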