An Analysis of Weight Decay As a Methodology of Reducing Three-Layer Feedforward Artificial Neural Networks for Classification Problems

Mo-Yuen Chow, J. Teeter
DOI: https://doi.org/10.1109/icnn.1994.374233
1994-01-01
Abstract: The structure of an artificial neural network chosen for a particular application can significantly affect the performance of the network. It is often advantageous or even necessary to choose the appropriate size for a network so that it will function more efficiently and/or provide greater insight into how the network learns the mapping. Weight decay is an attractive tool for reducing oversized networks to appropriate-sized ones. However, researchers have reported contrasting results for the methodology in the past. This paper examines the effectiveness of the conventional weight decay methodology as it applies to classification problems. Training parameters, stability and the effectiveness of the methodology are discussed and analyzed. XOR and AND are used as examples to illustrate the authors' discussion. It is found that for these examples, weight decay can consistently minimize the number of hidden nodes used to learn the mappings with hyperbolic tangent activation functions. Ongoing tests with other binary mappings reveal that the methodology exhibits strong potential for use in more complex applications.
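To make the idea in the abstract concrete, the following is a minimal sketch of weight decay in its commonly used form, where a penalty (decay / 2) * sum(w^2) is added to the training error. The abstract does not state the paper's exact formulation, so the penalty form, the function names (`weight_decay_step`, `count_active`), and the parameter values are all assumptions made for illustration.

```python
def weight_decay_step(weights, grads, lr=0.1, decay=0.01):
    """One gradient-descent step on E = E0 + (decay / 2) * sum(w ** 2).

    `grads` holds dE0/dw for each weight; the penalty contributes decay * w.
    The quadratic penalty is an assumed, commonly used form of weight decay.
    """
    return [w - lr * (g + decay * w) for w, g in zip(weights, grads)]


def count_active(weights, threshold=1e-2):
    """Count weights whose magnitude is still above a (hypothetical) pruning threshold."""
    return sum(1 for w in weights if abs(w) > threshold)


if __name__ == "__main__":
    # With zero error gradient, weights that the task does not need shrink
    # geometrically toward zero, which is what allows hidden nodes to be removed.
    w = [1.0, -2.0, 0.05]
    for _ in range(200):
        w = weight_decay_step(w, [0.0, 0.0, 0.0], lr=0.1, decay=0.5)
    print(w, count_active(w))
```

In this sketch a weight survives only if the error gradient keeps pushing against the decay term. Hidden nodes whose weights all fall below the threshold would then be candidates for removal.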