Hardware-Efficient SoftMax Architecture With Bit-Wise Exponentiation and Reciprocal Calculation

Jeongmin Kim, Sungho Kim, Kangjoon Choi, In-Cheol Park
DOI: https://doi.org/10.1109/tcsi.2024.3443270
2024-10-04
IEEE Transactions on Circuits and Systems I Regular Papers
Abstract: The SoftMax function is one of the activation functions used in deep neural networks (DNNs) to normalize input values to the range (0,1). With the advent of DNN models such as the Transformer, operations that use SoftMax have gained significant attention, and implementing them efficiently in hardware has become a prominent issue. Implementing SoftMax usually involves exponential and division operations, which can be a significant bottleneck in terms of hardware cost and performance. Various efforts have been made to address this challenge, and this paper introduces a novel approach to implementing SoftMax efficiently. In most previous works, the maximum input value is subtracted from all the input values to ensure numerical stability. In the proposed approach, the maximum value is replaced with a different value that reduces the hardware complexity while still ensuring numerical stability. Additionally, the exponential operations are computed bit-wise using simple Look-Up Tables (LUTs) with only one entry each, and the reciprocal of the total exponential sum is computed so that division is replaced with multiplication. Applying the proposed methods significantly reduces the computational complexity compared to the previous log-sum-exp approach. As a result, the proposed 8-bit SoftMax accelerator achieves a high operating frequency of 3.12 GHz and a high throughput of 25G inputs/s. It also improves area efficiency and power consumption by at least 2 times. In terms of accuracy, it achieves similar or even better results than previous works.
engineering, electrical & electronic
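
Two of the arithmetic ideas named in the abstract, bit-wise exponentiation with one constant per bit and a single reciprocal in place of per-element division, can be illustrated with a short software sketch. It rests on the identity exp(-d) = product of exp(-2^k) over the set bits k of d. This is not the authors' architecture: the fixed-point format (FRAC_BITS, TOTAL_BITS), the function names, and the use of floating-point constants are all assumptions made for illustration, and because the abstract does not say which value replaces the maximum, the sketch falls back to the conventional maximum subtraction.

```python
import math

# Assumed fixed-point format (not taken from the paper):
# inputs are integers in units of 2^-FRAC_BITS, and the
# difference (max - x) is held in TOTAL_BITS unsigned bits.
FRAC_BITS = 4
TOTAL_BITS = 8

# One constant per bit position, playing the role of a one-entry LUT:
# BIT_FACTORS[k] = exp(-2^(k - FRAC_BITS)).
BIT_FACTORS = [math.exp(-(2.0 ** (k - FRAC_BITS))) for k in range(TOTAL_BITS)]


def exp_neg_bitwise(d):
    """Compute exp(-d / 2^FRAC_BITS) for an integer 0 <= d < 2^TOTAL_BITS.

    Each set bit of d contributes one constant factor, so no general
    exponential unit is needed, only multiplications by constants.
    """
    result = 1.0
    for k in range(TOTAL_BITS):
        if (d >> k) & 1:
            result *= BIT_FACTORS[k]
    return result


def softmax_sketch(xs):
    """SoftMax over a list of fixed-point integers (units of 2^-FRAC_BITS)."""
    # Conventional stabilization: subtract the maximum (the paper replaces
    # this value with a different one that the abstract does not specify).
    m = max(xs)
    limit = (1 << TOTAL_BITS) - 1
    # Saturate large differences so they fit in TOTAL_BITS.
    exps = [exp_neg_bitwise(min(m - x, limit)) for x in xs]
    # A single reciprocal, followed by one multiplication per element.
    recip = 1.0 / sum(exps)
    return [e * recip for e in exps]
```

With FRAC_BITS = 4, the call softmax_sketch([16, 0, -16]) corresponds to the real-valued inputs 1, 0 and -1. A hardware version would quantize the constants and the reciprocal as well; the sketch keeps them in floating point to isolate the decomposition itself.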