An adaptive sequential Monte Carlo approach to neural network training

Zhang, Yiming; Hu, Bo
DOI: https://doi.org/10.1109/ICIT.2015.7125328
2015-01-01
Abstract: Sequential Monte Carlo methods (Particle Filters) have been successfully applied to the online training of neural networks. However, the generic Particle Filter requires the model noise to be known prior to training. Furthermore, the random walk assumption used to model the network weights may be problematic when the model noise is not sufficiently known. In this paper, the evolution of the network weights is modeled using the Polynomial Prediction Model (PPM), which has been shown to have more predictive power than the random walk. The PPM can generate a whole class of models, which can then be used in a modified multi-model version of the Particle Filter, based on the Interacting Multiple Model (IMM), to train the neural network. The resulting algorithm generates an estimate of the noise terms, in the form of a weighted linear combination of the model noise given in the different models, that is closer to the true noise. This means that the algorithm can adapt to unknown model noise. Experiments show that the proposed algorithm offers better performance in training neural networks when the error terms cannot be determined.
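The abstract's central idea, a noise estimate formed as a weighted linear combination of the noise assumed by each model, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the use of model probabilities as the weights, and the example numbers are all assumptions made for illustration, and the particle filtering and IMM mixing steps are omitted.

```python
def update_model_probs(prior_probs, likelihoods):
    """Bayes update of per-model probabilities (hypothetical helper).

    prior_probs: prior probability of each candidate model.
    likelihoods: likelihood of the latest observation under each model.
    """
    unnormalized = [p * l for p, l in zip(prior_probs, likelihoods)]
    total = sum(unnormalized)
    if total <= 0.0:
        raise ValueError("at least one model must have nonzero likelihood")
    return [u / total for u in unnormalized]


def combined_noise_variance(model_probs, model_noise_vars):
    """Noise estimate as a probability-weighted sum of per-model noise.

    Assumption: each model carries one scalar noise variance, and the
    weights are the current model probabilities.
    """
    return sum(p * q for p, q in zip(model_probs, model_noise_vars))


# Two hypothetical models: one assumes small noise, one assumes large noise.
probs = update_model_probs([0.5, 0.5], [0.2, 0.6])
noise_estimate = combined_noise_variance(probs, [0.01, 0.1])
```

In this sketch, the model that explains the latest observation better receives more weight, so the combined estimate moves toward that model's assumed noise level without the true noise being specified in advance.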