High-Precision Method and Architecture for Base-2 Softmax Function in DNN Training.

Yuan Zhang, Lele Peng, Lianghua Quan, Yonggang Zhang, Shubin Zheng, Hui Chen
DOI: https://doi.org/10.1109/tcsi.2023.3277247
2023-01-01
IEEE Transactions on Circuits and Systems I Regular Papers
Abstract: Softmax is a common and computationally complex activation function in Deep Neural Networks (DNNs). However, implementing it efficiently in DNN training hardware accelerators is a challenge. Therefore, we propose a high-precision calculation method and architecture based on base-2 softmax, which has lower hardware complexity than base-$e$ softmax but remains usable in DNN training. First, we simplify the hardware implementation of the base-2 softmax calculation. Second, we use the base-2 hyperbolic COordinate Rotation Digital Computer (CORDIC) to implement the core computation. Finally, we show through experiments that the proposed method can be used in DNN training. Moreover, at the same order of magnitude of precision, our hardware cost is lower than that of traditional base-$e$ softmax or other alternative designs. Under TSMC 28 nm CMOS technology, an example design of our architecture has an area of $98787.43\,\mu m^{2}$ and a power consumption of 24.72 mW when synthesized at a frequency of 1 GHz.
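For readers unfamiliar with the term, the following is a minimal software sketch of what "base-2 softmax" usually means: the exponential $e^{x_i}$ is replaced by $2^{x_i}$, so that $p_i = 2^{x_i} / \sum_j 2^{x_j}$. The abstract does not spell out this definition, so it is an assumption here, and the function names (`softmax_base`, `softmax2`, `softmax_e`) are hypothetical. The sketch does not model the paper's CORDIC-based architecture or its fixed-point precision; it only illustrates the function being computed.

```python
import math

def softmax_base(xs, base):
    """Softmax with an arbitrary base; subtracts the max for numerical stability."""
    m = max(xs)
    powers = [base ** (x - m) for x in xs]
    total = sum(powers)
    return [p / total for p in powers]

def softmax2(xs):
    """Base-2 softmax (assumed definition: 2**x_i / sum_j 2**x_j)."""
    return softmax_base(xs, 2.0)

def softmax_e(xs):
    """Conventional base-e softmax, for comparison."""
    return softmax_base(xs, math.e)

logits = [2.0, 1.0, 0.0]
p2 = softmax2(logits)
pe = softmax_e(logits)

# Base-2 softmax equals base-e softmax applied to logits scaled by ln(2).
pe_scaled = softmax_e([x * math.log(2.0) for x in logits])
```

Because $2^{x} = e^{x \ln 2}$, base-2 softmax is the same as base-$e$ softmax with the logits scaled by $\ln 2$, which is one plausible reason it can remain usable in training while being cheaper to compute in binary hardware.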