Base-2 Softmax Function: Suitability for Training and Efficient Hardware Implementation

Yuan Zhang, Yonggang Zhang, Lele Peng, Lianghua Quan, Shubin Zheng, Zhonghai Lu, Hui Chen
DOI: https://doi.org/10.1109/tcsi.2022.3175534
2022-01-01
Abstract: The softmax function is widely used in deep neural networks (DNNs), and its hardware performance plays an important role in the training and inference of DNN accelerators. However, due to the complexity of the traditional softmax, existing hardware architectures are resource-consuming or have low precision. To address these challenges, we study a base-2 softmax function in terms of its suitability for neural network training and efficient hardware implementation. Unlike the classical base-$e$ softmax function, the base-2 softmax function uses 2 as the exponential base instead of $e$. Through mathematical derivation and software simulation, we first demonstrate the feasibility and good accuracy of the base-2 softmax function in neural network training. Then, we use the symmetric-mapping lookup table (SM-LUT) method to design a low-complexity, high-precision architecture to implement it. Under TSMC 28 nm CMOS technology, an example design of our architecture has an area of $5676~\mu m^{2}$ and a power consumption of 13.12 mW for circuit synthesis at a frequency of 3 GHz. Compared with the latest works, our architecture achieves the best performance and efficiency.
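The relationship between the two bases can be illustrated with a short software sketch. This is not the paper's SM-LUT hardware architecture; it is a minimal floating-point illustration, and the function names `softmax_base2` and `softmax_base_e` are hypothetical. It relies only on the identity $2^{x} = e^{x \ln 2}$, which means the base-2 softmax of an input vector equals the base-$e$ softmax of the same vector scaled by $\ln 2$.

```python
import math


def softmax_base2(x):
    """Base-2 softmax: 2**x_i / sum_j 2**x_j (illustrative name, not from the paper)."""
    m = max(x)  # subtract the maximum so every exponent is <= 0 (avoids overflow)
    powers = [2.0 ** (v - m) for v in x]
    total = sum(powers)
    return [p / total for p in powers]


def softmax_base_e(x):
    """Classical base-e softmax, written the same way for comparison."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [1.0, 2.0, 3.0]
    p2 = softmax_base2(logits)
    # Scaling the inputs by ln 2 turns the base-e softmax into the base-2 softmax.
    pe_scaled = softmax_base_e([v * math.log(2.0) for v in logits])
    print(p2)
    print(pe_scaled)
```

For the logits 1, 2, 3 the base-2 outputs are in the ratio 1:2:4, since $2^1 : 2^2 : 2^3 = 2 : 4 : 8$. Because the change of base is only a rescaling of the inputs, the ordering of the outputs is the same as for the base-$e$ softmax.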