A New Constructive Algorithm for Architectural and Functional Adaptation of Artificial Neural Networks.

Md. Monirul Islam, Md. Abdus Sattar, Md. Faijul Amin, Xin Yao, Kazuyuki Murase
DOI: https://doi.org/10.1109/tsmcb.2009.2021849
2009-01-01
Abstract: The generalization ability of artificial neural networks (ANNs) depends greatly on their architectures. Constructive algorithms provide an attractive automatic way of determining a near-optimal ANN architecture for a given problem. Several such algorithms have been proposed in the literature and shown to be effective. This paper presents a new constructive algorithm (NCA) for automatically determining ANN architectures. Unlike most previous studies on determining ANN architectures, NCA emphasizes both architectural adaptation and functional adaptation in its architecture determination process. It uses a constructive approach to determine the number of hidden layers in an ANN and the number of neurons in each hidden layer. To achieve functional adaptation, NCA trains the hidden neurons of the ANN on different training sets, which are created using a concept similar to that used in the boosting algorithm. The purpose of using different training sets is to encourage hidden neurons to learn different parts or aspects of the training data, so that the ANN as a whole learns the training data better. In this paper, the convergence and computational issues of NCA are studied analytically. The computational complexity of NCA is found to be O(W × P_t × τ), where W is the number of weights in the ANN, P_t is the number of training examples, and τ is the number of training epochs. This complexity is of the same order as that required by the backpropagation learning algorithm to train a fixed ANN architecture. A set of eight classification and two approximation benchmark problems was used to evaluate the performance of NCA. The experimental results show that NCA can produce ANN architectures with fewer hidden neurons and better generalization ability than existing constructive and nonconstructive algorithms.
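The abstract does not spell out how the different training sets are built, only that the idea resembles boosting. The sketch below is therefore an illustration under stated assumptions, not NCA's actual procedure: it uses an AdaBoost.M1-style reweighting, in which correctly learned examples are down-weighted and the next training set is drawn by weighted sampling. The function names (weighted_error, update_distribution, resample) are hypothetical.

```python
import random


def weighted_error(weights, misclassified):
    """Sum of the weights of the examples the current network gets wrong."""
    return sum(w for w, miss in zip(weights, misclassified) if miss)


def update_distribution(weights, misclassified):
    """AdaBoost.M1-style reweighting (assumed here; NCA's exact rule may differ).

    Correctly handled examples are scaled by beta = e / (1 - e), then the
    weights are renormalized, so misclassified examples gain relative weight
    whenever the weighted error e is below 0.5.
    """
    error = weighted_error(weights, misclassified)
    # Clamp to avoid division by zero when error is exactly 0 or 1.
    error = min(max(error, 1e-10), 1.0 - 1e-10)
    beta = error / (1.0 - error)
    scaled = [w if miss else w * beta for w, miss in zip(weights, misclassified)]
    total = sum(scaled)
    return [w / total for w in scaled]


def resample(examples, weights, rng):
    """Draw a new training set of the same size by weighted sampling."""
    return rng.choices(examples, weights=weights, k=len(examples))


if __name__ == "__main__":
    examples = list(range(8))            # stand-ins for training examples
    weights = [1.0 / 8] * 8              # start from a uniform distribution
    misclassified = [True, True] + [False] * 6
    weights = update_distribution(weights, misclassified)
    next_training_set = resample(examples, weights, random.Random(0))
    print(weights)
    print(next_training_set)
```

In this sketch, a hidden neuron added later would be trained on next_training_set, which tends to contain more of the examples the current network handles poorly. That is one way to encourage different neurons to learn different parts of the data.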