Non-uniform Piecewise Linear Activation Functions in Deep Neural Networks

Zezhou Zhu, Yuan Dong
DOI: https://doi.org/10.1109/icpr56361.2022.9956345
2022-01-01
Abstract: Recently, the Piecewise Linear Unit (PWLU) has been proposed to learn specialized activation functions with a straightforward piecewise linear definition, achieving state-of-the-art performance across different vision tasks and neural networks. However, its uniformly distributed intervals strongly limit the flexibility of PWLU, and its definition requires a statistic-based realignment method to handle the misalignment between PWLU and the input data. This paper proposes a new piecewise linear activation function called the Non-uniform Piecewise Linear Unit (N-PWLU). N-PWLU has two advantages that overcome these drawbacks of PWLU. First, non-uniformly distributed intervals are used to increase flexibility. Second, a cumulative definition closely couples the parameters of different intervals, which helps alleviate the misalignment issue. With these advantages, N-PWLU significantly outperforms PWLU, especially with fewer intervals. For example, on the ImageNet classification dataset, 4-interval N-PWLU outperforms 4-interval PWLU by 1.15% in top-1 accuracy with MobileNet-V3. Moreover, the expressivity of 4-interval N-PWLU is comparable to that of 16-interval PWLU across different datasets and architectures. Fewer intervals simplify the computation of N-PWLU, which makes it well suited to deployment on edge devices. We believe that N-PWLU is a step forward in learning better parametric activation functions.
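The abstract does not state the N-PWLU formula, so the following is only a minimal sketch of the two ideas it names, not the authors' definition. It assumes that "non-uniform intervals" means each interval has its own positive width, and that "cumulative definition" means breakpoints and breakpoint values are built as running sums from the left boundary. The function name `npwl` and all parameter names are hypothetical, and the parameters are fixed numbers here rather than learned.

```python
import numpy as np


def npwl(x, left, widths, slopes, y_left, slope_left, slope_right):
    """Evaluate a continuous piecewise linear function with non-uniform intervals.

    Illustrative sketch only (assumed parameterization, not the paper's):
      left        -- position of the leftmost breakpoint
      widths      -- positive width of each interval (non-uniform)
      slopes      -- slope inside each interval
      y_left      -- function value at the leftmost breakpoint
      slope_left  -- slope used to the left of the first breakpoint
      slope_right -- slope used to the right of the last breakpoint
    """
    x = np.asarray(x, dtype=float)
    widths = np.asarray(widths, dtype=float)
    slopes = np.asarray(slopes, dtype=float)
    if widths.shape != slopes.shape or np.any(widths <= 0):
        raise ValueError("widths and slopes must match, and widths must be positive")

    # Cumulative definition: every breakpoint is the left boundary plus the
    # running sum of all widths before it, so changing one width shifts every
    # breakpoint to its right.
    breaks = left + np.concatenate(([0.0], np.cumsum(widths)))
    # Values at the breakpoints accumulate slope * width in the same way,
    # which keeps the function continuous by construction.
    values = y_left + np.concatenate(([0.0], np.cumsum(slopes * widths)))

    # Linear interpolation inside the bounded region.
    y = np.interp(x, breaks, values)
    # Linear extension outside the bounded region.
    y = np.where(x < breaks[0], values[0] + slope_left * (x - breaks[0]), y)
    y = np.where(x > breaks[-1], values[-1] + slope_right * (x - breaks[-1]), y)
    return y


# Usage: four intervals of unequal width whose slopes reproduce ReLU.
xs = np.array([-3.0, -1.5, 0.0, 0.25, 1.0, 3.0])
ys = npwl(xs, left=-2.0, widths=[1.0, 1.0, 0.5, 1.5],
          slopes=[0.0, 0.0, 1.0, 1.0], y_left=0.0,
          slope_left=0.0, slope_right=1.0)
```

In a trainable version one would typically keep the widths positive through a reparameterization such as an exponential or softplus of unconstrained parameters; that choice is likewise an assumption and is not taken from the abstract.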