Trainable activation function with differentiable negative side and adaptable rectified point

Kevin Pratama, Dae-Ki Kang
DOI: https://doi.org/10.1007/s10489-020-01885-z
IF: 5.3
2020-10-15
Applied Intelligence
Abstract: The activation function is one of the keys to an artificial neural network's ability to learn complex mapping functions. A recently proposed activation function, the Scaled Exponential Linear Unit (SELU), has the unique ability to automatically normalize its output toward a predefined mean and variance. In this paper, we introduce the Parametric Scaled Exponential Linear Unit (PSELU), a modification of SELU in which the parameters are learned adaptively during the training phase via the backpropagation algorithm. We then add further modifications aimed at obtaining a stronger gradient in the negative part and producing more negative output from our proposed method. Our proposed method is evaluated using various artificial neural network models on diverse image classification tasks. In addition, we measure its performance in comparison with other popular activation functions. The evaluation results provide empirical evidence of improvement in terms of network generalization performance. Moreover, our experiments support our initial hypothesis regarding the advantages obtained from using our proposed method. Finally, the behavior of the trained parameters and the dynamics of the training phase are observed in order to better understand the increased performance of our proposed method.
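To make the core idea of the abstract concrete, here is a minimal sketch of a SELU-shaped activation whose scale and negative-side parameters are treated as trainable values updated by gradient descent. The abstract does not give the exact PSELU formula, so this assumes the standard SELU form with its two constants (lambda and alpha) replaced by free parameters. The names `pselu`, `pselu_grads`, and `sgd_step` are hypothetical, and the paper's further modifications (the stronger negative-side gradient and the adaptable rectified point) are not reproduced here.

```python
import math

# Standard SELU constants; in this sketch they only serve as initial values.
SELU_LAMBDA = 1.0507009873554805
SELU_ALPHA = 1.6732632423543772


def pselu(x, lam, alpha):
    """SELU-shaped activation with lam and alpha passed in as free parameters.

    Assumption: the same functional form as SELU, with trainable constants.
    """
    return lam * x if x > 0 else lam * alpha * (math.exp(x) - 1.0)


def pselu_grads(x, lam, alpha):
    """Partial derivatives of pselu with respect to (x, lam, alpha)."""
    if x > 0:
        return lam, x, 0.0
    e = math.exp(x)
    return lam * alpha * e, alpha * (e - 1.0), lam * (e - 1.0)


def sgd_step(xs, upstream, lam, alpha, lr=0.01):
    """One gradient-descent update of lam and alpha.

    xs are pre-activations; upstream holds dL/dy for each activation output,
    as backpropagation would supply it from the layers above.
    """
    g_lam = 0.0
    g_alpha = 0.0
    for x, g in zip(xs, upstream):
        _, d_lam, d_alpha = pselu_grads(x, lam, alpha)
        g_lam += g * d_lam
        g_alpha += g * d_alpha
    return lam - lr * g_lam, alpha - lr * g_alpha
```

With `lam` and `alpha` fixed at `SELU_LAMBDA` and `SELU_ALPHA`, `pselu` reduces to ordinary SELU; calling `sgd_step` repeatedly lets both values drift away from those constants as training proceeds.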
computer science, artificial intelligence