Weight decay induced phase transitions in multilayer neural networks

M. Ahr, M. Biehl, E. Schloesser
DOI: https://doi.org/10.48550/arXiv.cond-mat/9901179
1999-01-19
Disordered Systems and Neural Networks
Abstract: We investigate layered neural networks with differentiable activation function and student vectors without normalization constraint by means of equilibrium statistical physics. We consider the learning of perfectly realizable rules and find that the length of student vectors becomes infinite unless a proper weight decay term is added to the energy. Then, the system undergoes a first-order phase transition between states with very long student vectors and states where the lengths are comparable to those of the teacher vectors. Additionally, in both configurations, there is a phase transition between a specialized and an unspecialized phase. An anti-specialized phase with long student vectors exists in networks with a small number of hidden units.
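As a rough illustration of what "adding a weight decay term to the energy" can look like, the sketch below evaluates a training energy for a student-teacher pair. It is not taken from the paper: the soft committee architecture, the tanh activation, the quadratic error, the L2 form of the penalty, and all names (`committee_output`, `energy`, `weight_decay`) are assumptions chosen for illustration.

```python
import numpy as np

def committee_output(weights, inputs):
    """Sum of hidden-unit activations (assumed architecture and activation)."""
    # weights: (K, N) hidden-unit vectors; inputs: (P, N) examples.
    # Local fields are scaled by 1/sqrt(N), a common convention (assumed here).
    fields = inputs @ weights.T / np.sqrt(weights.shape[1])
    return np.tanh(fields).sum(axis=1)

def energy(student, teacher, inputs, weight_decay=0.0):
    """Quadratic training error plus an assumed L2 weight decay penalty."""
    residual = committee_output(student, inputs) - committee_output(teacher, inputs)
    training_error = 0.5 * np.sum(residual ** 2)
    # The penalty grows with the squared length of the student vectors.
    penalty = 0.5 * weight_decay * np.sum(student ** 2)
    return training_error + penalty

rng = np.random.default_rng(0)
N, K, P = 50, 2, 200  # input dimension, hidden units, examples (arbitrary values)
teacher = rng.standard_normal((K, N))
inputs = rng.standard_normal((P, N))

# A student identical to the teacher reproduces the rule exactly.
e_plain = energy(teacher, teacher, inputs)
e_decay = energy(teacher, teacher, inputs, weight_decay=0.1)
```

With the student set equal to the teacher, the training error vanishes, so `e_plain` is zero and `e_decay` consists of the penalty alone. In this sketch the penalty is the only term that depends directly on the length of the student vectors.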