Convergence of Gradient Method for a Fully Recurrent Neural Network

Dongpo Xu, Zhengxue Li, Wei Wu
DOI: https://doi.org/10.1007/s00500-009-0398-0
IF: 3.732
2009-01-01
Soft Computing
Abstract: Recurrent neural networks have been successfully used for the analysis and prediction of temporal sequences. This paper is concerned with the convergence of a gradient-descent learning algorithm for training a fully recurrent neural network. In the literature, stochastic process theory has been used to establish probabilistic convergence results for the on-line gradient training algorithm, under the assumption that a very large number (in theory, infinitely many) of training samples of the temporal sequences are available. In this paper, we consider the case where only a limited number of training samples are available, so that a stochastic treatment of the problem is no longer appropriate. Instead, we use an off-line gradient training algorithm for the fully recurrent neural network and prove corresponding convergence results of a deterministic nature. The error function is also shown to decrease monotonically over the iterations. A numerical example is given to support the theoretical findings.
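The off-line (batch) setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or their numerical example. It assumes tanh units, a squared-error function summed over a small finite training set, a constant learning rate, and a central-difference gradient in place of an analytically derived one. All names (run_network, batch_error, numerical_gradient, train_offline) and all parameter values are hypothetical.

```python
import math
import random


def run_network(W, V, inputs):
    """Run a fully recurrent net (every unit feeds every unit) over a sequence.

    Assumption: tanh activation, zero initial state, unit 0 is the output unit.
    """
    n = len(W)
    state = [0.0] * n
    for x in inputs:
        state = [
            math.tanh(sum(W[i][j] * state[j] for j in range(n)) + V[i] * x)
            for i in range(n)
        ]
    return state[0]


def batch_error(params, n, samples):
    """Half the sum of squared errors over the whole finite training set."""
    W = [params[i * n:(i + 1) * n] for i in range(n)]  # recurrent weights
    V = params[n * n:]                                 # input weights
    return 0.5 * sum((run_network(W, V, seq) - target) ** 2
                     for seq, target in samples)


def numerical_gradient(params, n, samples, h=1e-6):
    """Central-difference gradient (a stand-in for an analytical gradient)."""
    grad = []
    for k in range(len(params)):
        up = params[:k] + [params[k] + h] + params[k + 1:]
        down = params[:k] + [params[k] - h] + params[k + 1:]
        grad.append((batch_error(up, n, samples)
                     - batch_error(down, n, samples)) / (2 * h))
    return grad


def train_offline(samples, n=3, eta=0.01, iterations=200, seed=0):
    """Off-line training: one weight update per pass over ALL samples."""
    rng = random.Random(seed)
    params = [rng.uniform(-0.5, 0.5) for _ in range(n * n + n)]
    errors = [batch_error(params, n, samples)]
    for _ in range(iterations):
        grad = numerical_gradient(params, n, samples)
        params = [p - eta * g for p, g in zip(params, grad)]
        errors.append(batch_error(params, n, samples))
    return params, errors
```

Calling train_offline on a handful of (sequence, target) pairs returns the trained weights and the error recorded after every update, so the behaviour of the error across iterations can be inspected directly. Because each update uses the gradient over the entire fixed training set, the procedure is deterministic once the initial weights are fixed.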