A Note on Conjugate Natural Gradient Training of Multilayer Perceptrons

Ana M. González, José R. Dorronsoro
DOI: https://doi.org/10.1109/IJCNN.2006.246779
2006-10-30
Abstract: Natural gradient has been shown to greatly accelerate on-line multilayer perceptron (MLP) training. It also improves standard batch gradient descent, as it gives a Gauss-Newton approximation to quasi-Newton mean square error minimization; in the batch setting, however, it should be slower than other superlinear minimization methods, such as the full Gaussian-quasi Newton method or the less complex but equally effective conjugate gradient descent method. In this work we investigate how to use natural gradients in a conjugate gradient setting, showing numerically that, when applied to batch MLP learning, they can lead to faster convergence to better minima than standard Euclidean conjugate gradient descent achieves.
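To make the idea in the abstract concrete, the following is a minimal sketch of one way natural gradients could be combined with conjugate directions for batch MLP training. It is not the authors' algorithm: the abstract gives no formulas, so everything here is an assumption made for illustration, including the one-hidden-layer tanh network, the damped Gauss-Newton matrix used as the metric, the preconditioned Polak-Ribière coefficient, the backtracking line search, and all names (`natural_cg`, `outputs_and_jacobian`, `damping`).

```python
import numpy as np

def unpack(theta, d, h):
    """Split a flat parameter vector into the weights of a d-h-1 MLP."""
    W1 = theta[:h * d].reshape(h, d)
    b1 = theta[h * d:h * d + h]
    w2 = theta[h * d + h:h * d + 2 * h]
    b2 = theta[-1]
    return W1, b1, w2, b2

def outputs_and_jacobian(theta, X, d, h):
    """Network outputs y (N,) and Jacobian J (N, P) of y w.r.t. theta."""
    W1, b1, w2, b2 = unpack(theta, d, h)
    H = np.tanh(X @ W1.T + b1)           # hidden activations, (N, h)
    y = H @ w2 + b2
    D = (1.0 - H ** 2) * w2              # dy/d(pre-activation), (N, h)
    J_W1 = (D[:, :, None] * X[:, None, :]).reshape(len(X), h * d)
    J = np.hstack([J_W1, D, H, np.ones((len(X), 1))])
    return y, J

def loss(theta, X, t, d, h):
    y, _ = outputs_and_jacobian(theta, X, d, h)
    return 0.5 * np.mean((y - t) ** 2)

def grad_and_natural_grad(theta, X, t, d, h, damping):
    """Euclidean gradient g and natural gradient G^{-1} g.

    Assumption: G is the damped Gauss-Newton matrix J^T J / N + damping * I.
    """
    y, J = outputs_and_jacobian(theta, X, d, h)
    n = len(X)
    g = J.T @ (y - t) / n
    G = J.T @ J / n + damping * np.eye(theta.size)
    return g, np.linalg.solve(G, g)

def natural_cg(theta, X, t, d, h, iters=50, damping=1e-3):
    g, ng = grad_and_natural_grad(theta, X, t, d, h, damping)
    direction = -ng
    history = [loss(theta, X, t, d, h)]
    for _ in range(iters):
        if g @ direction >= 0:           # restart if not a descent direction
            direction = -ng
        slope, current, step = g @ direction, history[-1], 1.0
        # Backtracking (Armijo) line search along the conjugate direction.
        while step > 1e-10:
            trial = theta + step * direction
            trial_loss = loss(trial, X, t, d, h)
            if trial_loss <= current + 1e-4 * step * slope:
                break
            step *= 0.5
        else:
            break                        # no acceptable step; stop iterating
        theta = trial
        history.append(trial_loss)
        g_new, ng_new = grad_and_natural_grad(theta, X, t, d, h, damping)
        # Preconditioned Polak-Ribiere coefficient, clipped at zero.
        denom = g @ ng
        beta = max(0.0, g_new @ (ng_new - ng) / denom) if denom > 1e-300 else 0.0
        direction = -ng_new + beta * direction
        g, ng = g_new, ng_new
    return theta, history

# Toy regression problem (synthetic data, for illustration only).
rng = np.random.default_rng(0)
d, h = 2, 4
X = rng.uniform(-1.0, 1.0, size=(40, d))
t = np.sin(3.0 * X[:, 0]) * X[:, 1]
theta0 = 0.5 * rng.standard_normal(h * d + 2 * h + 1)
theta_fit, history = natural_cg(theta0, X, t, d, h)
```

Setting `beta` to zero on every iteration reduces this sketch to plain damped natural gradient descent, which gives a simple baseline for comparing the effect of the conjugate directions on a given problem.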