ErfReLU: Adaptive Activation Function for Deep Neural Network

Ashish Rajanand, Pradeep Singh

2023-06-02

Abstract: Recent research has found that the activation function (AF) chosen to add non-linearity to the output can strongly affect how well deep learning networks perform. Activation functions that adapt as the network learns are therefore needed. Researchers have recently started developing activation functions that can be trained throughout the learning process, known as trainable or adaptive activation functions (AAFs). Research on AAFs that improve results is still in its early stages. In this paper, a novel activation function, 'ErfReLU', is developed based on the error function (erf) and ReLU, exploiting the advantages of both. State-of-the-art activation functions such as Sigmoid, ReLU, and Tanh, and their properties, are briefly explained. Adaptive activation functions such as Tanhsoft1, Tanhsoft2, Tanhsoft3, TanhLU, SAAF, ErfAct, Pserf, Smish, and Serf are also described. Lastly, a performance analysis of nine trainable activation functions (Tanhsoft1, Tanhsoft2, Tanhsoft3, TanhLU, SAAF, ErfAct, Pserf, Smish, and Serf) alongside the proposed one is presented, applying them in MobileNet, VGG16, and ResNet models on the CIFAR-10, MNIST, and FMNIST benchmark datasets.

Neural and Evolutionary Computing, Machine Learning

What problem does this paper attempt to address?

This paper addresses several limitations of existing activation functions in deep neural networks, in particular how to improve the performance of deep-learning models and overcome the "dead ReLU" problem of traditional activation functions such as ReLU. Specifically:

1. **Introduction of non-linearity**: Linear activation functions cannot capture non-linear relationships in data. Commonly used activation functions such as Sigmoid, Tanh, and ReLU do introduce non-linearity, but they perform poorly or cause problems in some cases.
2. **Vanishing gradient problem**: Many activation functions are prone to vanishing gradients during back-propagation, which degrades training.
3. **Handling of the negative region**: ReLU outputs zero for negative inputs, which can cause some neurons to "die", that is, to stop responding to any input. This limits the learning ability of the model.
4. **Adaptive ability**: Most existing activation functions are fixed and cannot adjust their shape or parameters to the data, which limits the generalization and expressive power of the model.

To address these problems, the authors propose a new adaptive activation function, ErfReLU, which combines the advantages of the error function (erf) and ReLU. ErfReLU keeps the behavior of ReLU in the positive region and uses the error function to introduce non-linearity in the negative region, which avoids neuron "death" and reduces the vanishing gradient problem. In addition, ErfReLU has relatively few parameters and can adapt to different datasets and tasks.

In summary, the core objective of this paper is to develop a new activation function that can be adaptively adjusted and performs well across application scenarios, so as to improve the overall performance of deep neural networks.
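The summary above does not give ErfReLU's formula. As a minimal sketch, assume a piecewise form that returns x for x >= 0 and alpha * erf(x) for x < 0, with a single trainable scale alpha. This form, the function names, and the default alpha = 0.5 are assumptions made for illustration, not details confirmed by the text here. The sketch contrasts the gradient of this function with plain ReLU in the negative region, using only the Python standard library.

```python
import math


def relu(x: float) -> float:
    """Plain ReLU: zero for non-positive inputs, identity otherwise."""
    return x if x > 0.0 else 0.0


def relu_grad(x: float) -> float:
    """Derivative of ReLU w.r.t. x (taken as 0 at x = 0)."""
    return 1.0 if x > 0.0 else 0.0


def erf_relu(x: float, alpha: float = 0.5) -> float:
    """Assumed ErfReLU form: identity for x >= 0, alpha * erf(x) for x < 0.

    alpha is the (hypothetical) trainable scale of the negative branch.
    """
    return x if x >= 0.0 else alpha * math.erf(x)


def erf_relu_grad_x(x: float, alpha: float = 0.5) -> float:
    """Derivative w.r.t. x, using d/dx erf(x) = 2/sqrt(pi) * exp(-x^2)."""
    if x >= 0.0:
        return 1.0
    return alpha * 2.0 / math.sqrt(math.pi) * math.exp(-x * x)


def erf_relu_grad_alpha(x: float) -> float:
    """Derivative w.r.t. alpha: only the negative branch depends on alpha."""
    return 0.0 if x >= 0.0 else math.erf(x)


if __name__ == "__main__":
    for x in (-2.0, -1.0, -0.1, 0.0, 1.0):
        print(
            f"x={x:+.1f}  relu'={relu_grad(x):.4f}  "
            f"erf_relu'={erf_relu_grad_x(x):.4f}  "
            f"d/dalpha={erf_relu_grad_alpha(x):.4f}"
        )
```

Under this assumed form, a negative input still passes a non-zero gradient to both the input and alpha, whereas ReLU's gradient there is exactly zero. The slope of erf decays like exp(-x^2), so the effect is concentrated near zero and fades for strongly negative inputs.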

EIS - Efficient and Trainable Activation Functions for Better Accuracy and Performance

A Method on Searching Better Activation Functions

ErfAct and Pserf: Non-monotonic Smooth Trainable Activation Functions

An Efficient Asymmetric Nonlinear Activation Function for Deep Neural Networks

Adaptive Blending Units: Trainable Activation Functions for Deep Neural Networks

An overview of the activation functions used in deep learning algorithms

APALU: A Trainable, Adaptive Activation Function for Deep Learning Networks

Improving Fault Tolerance for Reliable DNN Using Boundary-Aware Activation

TaLU: A Hybrid Activation Function Combining Tanh and Rectified Linear Unit to Enhance Neural Networks

Web-aided data set expansion in deep learning: evaluating trainable activation functions in ResNet for improved image classification

Deep Learning Activation Functions: Fixed-Shape, Parametric, Adaptive, Stochastic, Miscellaneous, Non-Standard, Ensemble

Competition-based Adaptive ReLU for Deep Neural Networks

Activation Functions: Comparison of trends in Practice and Research for Deep Learning

Trainable Highly-expressive Activation Functions

Activation Functions: Dive into an optimal activation function

Reproducing Activation Function for Deep Learning

Activation function optimization scheme for image classification

Normalized Activation Function: Toward Better Convergence

Effect of Activation Functions on the Training of Overparametrized Neural Nets