A Low-Latency Power Series Approximate Computing and Architecture for Co-Calculation of Division and Square Root
Dian Tian, Ningmei Yu, Minghui Xie, Jiahao Tang, Zhuang Feng, Álvaro Hernández, Jesús Ureña
DOI: https://doi.org/10.1109/tcsi.2024.3368102
2024-01-01
Abstract: The calculation of division and square root is widely used in edge computing for image processing, clustering, recognition, and reconstruction, among other tasks. The conventional multi-stage serial calculation of these operations incurs higher latency and requires more redundant hardware resources for multi-dimensional parallel computations. This work proposes a low-latency power series approximate digital computing (PSADIC) paradigm and architecture, which achieves low-latency calculation of multi-dimensional mathematical expressions, focusing on the co-calculation of division and square root. The approach computes not only the division and the square root, but also the inverse and the inverse square root. Moreover, serial and pipelined architectures have been designed to achieve smaller area and lower latency, respectively. Compared to multi-stage calculation with CORDIC (COordinate Rotation DIgital Computer), PSADIC reduces the mean relative error distance (MRED) by 67%. Under TSMC (Taiwan Semiconductor Manufacturing Company) 40 nm CMOS technology, the proposed serial and pipelined architectures achieve a 66.67% reduction in overall latency compared to CORDIC when each design runs at its own maximum clock frequency, and a 75% reduction in cycle latency. Both the ADP (Area Delay Product) and the PDP (Power Delay Product) are also optimized.
engineering, electrical & electronic
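
The abstract does not spell out the series used by PSADIC, so the following is only a minimal sketch of the general idea of power series approximation, assuming a truncated generalized binomial series for (1 + x)^p with |x| < 1. The function names `binomial_series` and `mred`, the term count, and the input range are illustrative assumptions, not the authors' design. The exponents p = -1, 1/2, and -1/2 give the inverse, the square root, and the inverse square root from the same routine, and a division a/b is formed as a times the inverse of b.

```python
from math import sqrt


def binomial_series(x, p, terms):
    """Truncated series for (1 + x)**p; assumes |x| < 1 (illustrative only)."""
    total = 1.0
    coeff = 1.0
    for k in range(1, terms):
        # Generalized binomial coefficient C(p, k), built incrementally.
        coeff *= (p - (k - 1)) / k
        total += coeff * x ** k
    return total


def mred(approx, exact):
    """Mean relative error distance: average of |approx - exact| / |exact|."""
    return sum(abs(a - e) / abs(e) for a, e in zip(approx, exact)) / len(exact)


# Hypothetical operands in [0.75, 1.25], so x = b - 1 stays within [-0.25, 0.25].
operands = [0.75 + 0.5 * i / 100 for i in range(101)]
TERMS = 8  # assumed truncation length, not taken from the paper

inverse = [binomial_series(b - 1.0, -1.0, TERMS) for b in operands]
square_root = [binomial_series(b - 1.0, 0.5, TERMS) for b in operands]
inverse_sqrt = [binomial_series(b - 1.0, -0.5, TERMS) for b in operands]

# Division a / b as a * (1 / b), with an arbitrary numerator a = 3.0.
quotient = [3.0 * r for r in inverse]

mred_inverse = mred(inverse, [1.0 / b for b in operands])
mred_sqrt = mred(square_root, [sqrt(b) for b in operands])
mred_inverse_sqrt = mred(inverse_sqrt, [1.0 / sqrt(b) for b in operands])
mred_quotient = mred(quotient, [3.0 / b for b in operands])

print(mred_inverse, mred_sqrt, mred_inverse_sqrt, mred_quotient)
```

In this sketch all four results share one polynomial-evaluation routine, which is the sense in which the operations can be computed together; a hardware design would additionally need range reduction to bring arbitrary operands into the convergence interval, which is not shown here.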