Abstract: Effective activation functions introduce non-linear transformations that give neural networks stronger fitting capabilities and help them adapt to real data distributions. Huawei Noah's Lab holds that dynamic activation functions are better suited than static ones for enhancing the non-linear capabilities of neural networks, and related research from Tsinghua University also suggests using dynamically adjusted activation functions. Building on these ideas of fine-tuned activation functions, we propose a series-based learnable activation function called LSLU (Learnable Series Linear Units). The method simplifies deep learning networks while improving accuracy. It introduces learnable parameters $\theta$ and $\omega$ that control the activation function, adapting it to the current layer's training stage and improving the model's generalization. The principle is to increase non-linearity in each activation layer, which boosts the network's overall non-linearity. We evaluate LSLU on CIFAR10, CIFAR100, and task-specific datasets (e.g., Silkworm) to validate its effectiveness, and we analyze the convergence behavior of $\theta$ and $\omega$ and their effects on generalization. Our empirical results show that LSLU enhances the generalization ability of the original model across various tasks while speeding up training. In VanillaNet training, $\theta$ initially decreases, then increases before stabilizing, while $\omega$ shows the opposite trend. Ultimately, LSLU achieves a 3.17% accuracy improvement on CIFAR100 for VanillaNet (Table 3). Code is available at https://github.com/vontran2021/Learnable-series-linear-units-LSLU.
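The abstract does not state the functional form of LSLU, so the following is only a rough sketch of what a series-based activation with learnable parameters could look like, not the LSLU definition itself. It assumes a hypothetical form f(x) = sum_i theta[i] * relu(x + omega[i]), with theta acting as per-term scales and omega as per-term shifts. The class name `SeriesActivation`, its methods, the initial values, and the squared-error objective are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


def relu(x: float) -> float:
    return x if x > 0.0 else 0.0


@dataclass
class SeriesActivation:
    """Hypothetical series activation: f(x) = sum_i theta[i] * relu(x + omega[i]).

    This form is an assumption made for illustration; it is not taken from the paper.
    """
    theta: List[float] = field(default_factory=lambda: [1.0, 0.0, 0.0])
    omega: List[float] = field(default_factory=lambda: [0.0, -1.0, 1.0])

    def forward(self, x: float) -> float:
        return sum(t * relu(x + w) for t, w in zip(self.theta, self.omega))

    def grads(self, x: float) -> Tuple[List[float], List[float]]:
        # df/dtheta_i = relu(x + omega_i)
        d_theta = [relu(x + w) for w in self.omega]
        # df/domega_i = theta_i where the shifted input is positive, else 0
        d_omega = [t if x + w > 0.0 else 0.0
                   for t, w in zip(self.theta, self.omega)]
        return d_theta, d_omega

    def sgd_step(self, x: float, target: float, lr: float = 0.1) -> float:
        """One gradient step on 0.5 * (f(x) - target)^2; returns the loss before the update."""
        err = self.forward(x) - target
        d_theta, d_omega = self.grads(x)  # both computed from the pre-update parameters
        self.theta = [t - lr * err * g for t, g in zip(self.theta, d_theta)]
        self.omega = [w - lr * err * g for w, g in zip(self.omega, d_omega)]
        return 0.5 * err * err


act = SeriesActivation()
before = act.sgd_step(2.0, 3.0)  # loss with the initial parameters
after = act.sgd_step(2.0, 3.0)   # loss after one update of theta and omega
```

With the default initialization the function starts out as a plain ReLU, and the two parameter lists then move away from that starting point as training proceeds. In a real network the same idea would normally be expressed with a framework's trainable-parameter mechanism and automatic differentiation rather than hand-written gradients.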

Developing Novel T-Swish Activation Function in Deep Learning

Smish: A Novel Activation Function for Deep Learning Methods

SwishReLU: A Unified Approach to Activation Functions for Enhanced Deep Neural Networks Performance

Swish-T : Enhancing Swish Activation with Tanh Bias for Improved Neural Network Performance

A Non-monotonic Smooth Activation Function

An overview of the activation functions used in deep learning algorithms

Activation function optimization method: Learnable series linear units (LSLUs)

Normalized Activation Function: Toward Better Convergence

Activation function optimization scheme for image classification

Mish: A Self Regularized Non-Monotonic Neural Activation Function

TaLU: A Hybrid Activation Function Combining Tanh and Rectified Linear Unit to Enhance Neural Networks

Swim: A General-Purpose, High-Performing, and Efficient Activation Function for Locomotion Control Tasks

SMU: smooth activation function for deep networks using smoothing maximum technique

Evaluating Model Performance with Hard-Swish Activation Function Adjustments

EIS - Efficient and Trainable Activation Functions for Better Accuracy and Performance

Activation Functions: Dive into an optimal activation function

Zorro: A Flexible and Differentiable Parametric Family of Activation Functions That Extends ReLU and GELU

Stochastic Adaptive Activation Function

ErfReLU: Adaptive Activation Function for Deep Neural Network

Learning Activation Functions for Sparse Neural Networks

Trainable Highly-expressive Activation Functions