Abstract: The architecture of a deep learning model is crucial: a model with an inappropriate architecture often suffers from performance degradation or parameter redundancy. However, finding an appropriate architecture for a given application is difficult and relies heavily on experience. To tackle this problem, we propose a novel deep learning model with a dynamic architecture, named the self-growing binary activation network (SGBAN), which progressively extends the architecture of a fully connected network (FCN), resulting in a more compact architecture with higher performance on a given task. This construction process is more efficient than neural architecture search methods, which train a large number of networks to search for the optimal one. Concretely, the training technique of SGBAN is based on function-preserving transformations that expand the architecture and incorporate the information in new data without discarding the knowledge learned in previous steps. Experimental results on four classification tasks, i.e., Iris, MNIST, CIFAR-10, and CIFAR-100, demonstrate the effectiveness of SGBAN. On the one hand, SGBAN achieves competitive accuracy compared with an FCN of the same architecture, which indicates that the new training technique has optimization ability equivalent to that of traditional optimization methods. On the other hand, on MNIST, the architecture generated by SGBAN achieves a 0.59% improvement in accuracy with only 33.44% of the parameters compared with FCNs with manually designed architectures, i.e., 500+150 hidden units. Furthermore, we demonstrate that replacing the fully connected layers of a well-trained VGG-19 with SGBAN yields slightly improved performance with less than 1% of the parameters on all these tasks. Finally, we show that the proposed method can perform incremental learning and outperforms three prominent incremental learning methods, i.e., learning without forgetting, elastic weight consolidation, and gradient episodic memory, on both Disjoint MNIST and Disjoint CIFAR-10.
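
The abstract does not spell out which function-preserving transformation SGBAN uses, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a two-layer FCN with a hypothetical binary step activation and grows the hidden layer by appending units whose outgoing weights are zero, which is one common way to widen a layer without changing the function it computes. All names (`binary_step`, `forward`, `grow_hidden_units`) and the layer sizes are illustrative assumptions.

```python
import numpy as np

def binary_step(z):
    # Assumed binary activation: 1 where z > 0, else 0.
    return (z > 0).astype(z.dtype)

def forward(x, W1, b1, W2, b2):
    # Two-layer fully connected network with a binary hidden activation.
    h = binary_step(x @ W1 + b1)
    return h @ W2 + b2

def grow_hidden_units(W1, b1, W2, n_new, rng):
    # Append n_new hidden units. Incoming weights are random, outgoing
    # weights are zero, so the new units cannot change the output yet.
    in_dim, out_dim = W1.shape[0], W2.shape[1]
    W1_new = np.concatenate([W1, rng.standard_normal((in_dim, n_new))], axis=1)
    b1_new = np.concatenate([b1, np.zeros(n_new)])
    W2_new = np.concatenate([W2, np.zeros((n_new, out_dim))], axis=0)
    return W1_new, b1_new, W2_new

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 4))                      # 5 samples, 4 features
W1, b1 = rng.standard_normal((4, 3)), np.zeros(3)    # 3 hidden units
W2, b2 = rng.standard_normal((3, 2)), np.zeros(2)    # 2 outputs

before = forward(x, W1, b1, W2, b2)
W1g, b1g, W2g = grow_hidden_units(W1, b1, W2, n_new=2, rng=rng)
after = forward(x, W1g, b1g, W2g, b2)

# The widened network computes the same outputs as the original one.
outputs_match = np.allclose(before, after)
```

After such a step, further training can adjust the new units' outgoing weights to absorb new data, while the network starts from exactly the function it had already learned.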

Bi-firing Deep Neural Networks

Adaptive Multi-Level Firing for Direct Training Deep Spiking Neural Networks

Auto-encoder Using the Bi-Firing Activation Function

Normalized Activation Function: Toward Better Convergence

Activation Adaptation in Neural Networks

Locally adaptive activation functions with slope recovery term for deep and physics-informed neural networks

Adaptive Blending Units: Trainable Activation Functions for Deep Neural Networks

IM-LIF: Improved Neuronal Dynamics with Attention Mechanism for Direct Training Deep Spiking Neural Network

Deep neural networks with data dependent implicit activation function

Neuro-Inspired Deep Neural Networks with Sparse, Strong Activations

Self-Growing Binary Activation Network: A Novel Deep Learning Model with Dynamic Architecture

An Efficient Asymmetric Nonlinear Activation Function for Deep Neural Networks

Activated Gradients for Deep Neural Networks

Smish: A Novel Activation Function for Deep Learning Methods

Deep Learning with Data Dependent Implicit Activation Function

Your Network May Need to Be Rewritten: Network Adversarial Based on High-Dimensional Function Graph Decomposition

Activation Ensembles for Deep Neural Networks

Effect of Activation Functions on the Training of Overparametrized Neural Nets

Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training

Neural Networks with Activation Networks

Hybrid deep additive neural networks