Developing Novel T-Swish Activation Function in Deep Learning

M. Zulqarnain, Irteza Syed, Rozaida Ghazali, Noor Aida Husaini, Irfan Javid
DOI: https://doi.org/10.1109/ICIT56493.2022.9989151
2022-10-03
Abstract: Activation functions are an essential component of neural network construction. In the hidden layers, they govern how well the model can learn from the training dataset. In the output layer, they largely determine the type of predictions the model can produce. Activation function design therefore remains an important area for further study aimed at increasing the efficacy of deep neural networks, and it has become a growing focus of our research. Over time, several activation functions have been proposed to remedy the shortcomings of earlier ones. More than 40 have been put forward, including well-known ones such as Tanh, Mish, sigmoid, Swish, and the Rectified Linear Unit (ReLU), yet none resolves these shortcomings entirely. We developed a new activation function, Tunable Swish (T-Swish), that improves performance on object classification tasks, using datasets such as CIFAR-100, CIFAR-10, MNIST, and Fashion-MNIST. To validate its applicability, we compared it with well-known activation functions (ReLU, sigmoid, Tanh, and Swish) using the Network in Network, ResNet-56 version 1, LeNet-5, and U-Net architectures. In particular, T-Swish enables fast network training, which makes it well suited to transfer learning.
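For orientation, the four baseline activations named in the abstract can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook definitions, not the authors' code, and it does not reproduce T-Swish, whose exact form the abstract does not give. The function names and the `beta` parameter of `swish` are assumptions made for this sketch; `beta` follows the commonly used parameterised form of Swish, x * sigmoid(beta * x).

```python
import math


def relu(x: float) -> float:
    """Rectified Linear Unit: passes positive inputs, zeroes out the rest."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def tanh(x: float) -> float:
    """Hyperbolic tangent, squashing inputs into (-1, 1)."""
    return math.tanh(x)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish in its common parameterised form: x * sigmoid(beta * x).

    `beta` is an illustrative parameter (assumption of this sketch);
    it is NOT the tunable parameter of T-Swish from the paper.
    """
    return x * sigmoid(beta * x)


if __name__ == "__main__":
    # Print each baseline at a few sample points for a quick comparison.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(x, relu(x), sigmoid(x), tanh(x), swish(x))
```

In this form, `beta = 0` reduces Swish to the linear function x / 2, while a large `beta` makes it approach ReLU, which shows how a single parameter in a Swish-style function can shift its shape.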
Computer Science