Evolving Deep Neural Networks via Cooperative Coevolution With Backpropagation

Maoguo Gong, Jia Liu, A. K. Qin, Kun Zhao, Kay Chen Tan
DOI: https://doi.org/10.1109/tnnls.2020.2978857
IF: 14.255
2021-01-01
IEEE Transactions on Neural Networks and Learning Systems
Abstract: Deep neural networks (DNNs), characterized by sophisticated architectures capable of learning a hierarchy of feature representations, have achieved remarkable success in various applications. Learning the parameters of a DNN is a crucial but challenging task that is commonly handled with gradient-based backpropagation (BP) methods. However, BP-based methods are highly sensitive to initialization and prone to becoming trapped in inferior local optima. To address these issues, we propose a DNN learning framework, called BPCC, that hybridizes cooperative coevolution (CC)-based optimization with BP-based gradient descent, and implement it by devising a computationally efficient CC-based optimization technique dedicated to DNN parameter learning. In BPCC, BP is executed intermittently over multiple training epochs. Whenever BP fails to sufficiently decrease the training objective function value in a training epoch, CC takes over, using the parameter values derived by BP as its starting point. The best parameter values obtained by CC then serve as the starting point of BP in its next training epoch. In CC-based optimization, the overall parameter learning task is decomposed into many subtasks, each of which learns a small portion of the parameters. These subtasks are addressed individually in a cooperative manner. In this article, we treat neurons as the basic decomposition units. Furthermore, to reduce the computational cost, we devise a maturity-based subtask selection strategy that selectively solves the subtasks of higher priority. Experimental results demonstrate the superiority of the proposed method over common-practice DNN parameter learning techniques.
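The alternation between BP and CC described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the evolutionary operator, the improvement threshold, or the maturity measure, so the sketch assumes a one-hidden-layer tanh network with a mean squared error objective, uses Gaussian perturbation with greedy acceptance as a stand-in for the CC optimizer, and omits maturity-based subtask selection entirely (every hidden neuron is visited). All function names and parameters (`bp_epoch`, `cc_step`, `train_bpcc`, `min_improve`, `pop`, `sigma`) are hypothetical.

```python
import numpy as np

def forward(params, X):
    W1, b1, W2, b2 = params
    H = np.tanh(X @ W1 + b1)           # hidden activations
    return H, H @ W2 + b2              # linear output layer

def loss(params, X, y):
    _, out = forward(params, X)
    return float(np.mean((out - y) ** 2))

def bp_epoch(params, X, y, lr=0.05):
    """One full-batch gradient descent epoch (stand-in for BP training)."""
    W1, b1, W2, b2 = params
    H, out = forward(params, X)
    d_out = 2.0 * (out - y) / out.size       # gradient of the mean squared error
    gW2, gb2 = H.T @ d_out, d_out.sum(axis=0)
    d_H = (d_out @ W2.T) * (1.0 - H ** 2)    # backpropagate through tanh
    gW1, gb1 = X.T @ d_H, d_H.sum(axis=0)
    return [W1 - lr * gW1, b1 - lr * gb1, W2 - lr * gW2, b2 - lr * gb2]

def cc_step(params, X, y, rng, pop=8, sigma=0.1):
    """Neuron-wise search starting from the BP solution.

    Each hidden neuron's incoming weights and bias form one subtask.
    Candidates for a subtask are scored with all other parameters held
    fixed, and a candidate is kept only if it lowers the full objective.
    (Assumed operator: Gaussian perturbation with greedy acceptance.)
    """
    best = [p.copy() for p in params]
    best_loss = loss(best, X, y)
    for j in range(best[0].shape[1]):        # one subtask per hidden neuron
        for _ in range(pop):
            cand = [p.copy() for p in best]
            cand[0][:, j] += sigma * rng.standard_normal(cand[0].shape[0])
            cand[1][j] += sigma * rng.standard_normal()
            cand_loss = loss(cand, X, y)
            if cand_loss < best_loss:
                best, best_loss = cand, cand_loss
    return best

def train_bpcc(X, y, hidden=8, epochs=200, min_improve=1e-4, seed=0):
    """Run BP epoch by epoch and hand over to CC when BP stalls."""
    rng = np.random.default_rng(seed)
    params = [0.5 * rng.standard_normal((X.shape[1], hidden)), np.zeros(hidden),
              0.5 * rng.standard_normal((hidden, 1)), np.zeros(1)]
    history = [loss(params, X, y)]
    cc_calls = 0
    for _ in range(epochs):
        before = history[-1]
        params = bp_epoch(params, X, y)
        after = loss(params, X, y)
        if before - after < min_improve:     # BP did not decrease the loss enough
            params = cc_step(params, X, y, rng)  # CC starts from the BP result
            cc_calls += 1
            after = loss(params, X, y)
        history.append(after)                # next BP epoch starts from CC's best
    return params, history, cc_calls
```

Because `cc_step` accepts a candidate only when it lowers the full objective, the CC phase in this sketch can never return parameters worse than the ones BP handed to it. A faithful implementation would replace the greedy perturbation with the paper's CC optimizer and would visit only the subtasks chosen by its maturity-based selection strategy.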
computer science, artificial intelligence, theory & methods, engineering, electrical & electronic, hardware & architecture