Ultralow-Latency VLSI Architecture Based on a Linear Approximation Method for Computing Nth Roots of Floating-Point Numbers

Fei Lyu, Xiaoqi Xu, Yu Wang, Yuanyong Luo, Yuxuan Wang, Hongbing Pan
DOI: https://doi.org/10.1109/tcsi.2020.3038417
2020-01-01
IEEE Transactions on Circuits and Systems I Regular Papers
Abstract: State-of-the-art approaches that perform root computations based on the COordinate Rotation Digital Computer (CORDIC) algorithm suffer from high latency because they require multiple iterations. Therefore, root computations based on the CORDIC algorithm cannot meet the strict latency requirements of some applications. In this paper, we propose a methodology for performing Nth root computations on floating-point numbers based on the piecewise linear (PWL) approximation method. The proposed method divides an Nth root computation into several subtasks approximated by the PWL algorithm. It determines the widest segments of the subtasks and the smallest fractional width needed to satisfy the predefined maximum relative error Max_Err_r. Our design is coded in Verilog HDL and synthesized under TSMC 40 nm CMOS technology. The synthesized results show that our design can reach the highest frequency of 2.703 GHz with an area consumption of 2608.84 μm² and a power consumption of 2.4476 mW. Compared with one state-of-the-art architecture, our design saves 91.60%, 89.84%, and 63.33% of the area, power, and latency, respectively, at a frequency of 1.89 GHz, while reducing Max_Err_r by 57.30%. In addition, it saves 94.52%, 92.68%, and 73.17% of the area, power, and delay, respectively, at a frequency of 1.89 GHz, and reduces Max_Err_r by 1.65% when compared with the other state-of-the-art design.
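The general idea of a PWL-based Nth root can be illustrated with a minimal software sketch. It is not the authors' architecture: the abstract does not spell out the subtask decomposition, so the split below (mantissa root by PWL, exponent divided by N with a remainder factor) is an assumption, as are the function names `build_pwl_table` and `nth_root_pwl`, the uniform segment width, and the use of double-precision arithmetic in place of the paper's fixed fractional width.

```python
import math


def build_pwl_table(n, segments):
    """Build (slope, intercept) pairs approximating m**(1/n) on [1, 2).

    Assumption: uniform segments with endpoint (chord) interpolation.
    The paper instead searches for the widest admissible segments.
    """
    table = []
    for i in range(segments):
        lo = 1.0 + i / segments
        hi = 1.0 + (i + 1) / segments
        f_lo = lo ** (1.0 / n)
        f_hi = hi ** (1.0 / n)
        slope = (f_hi - f_lo) / (hi - lo)
        table.append((slope, f_lo - slope * lo))
    return table


def nth_root_pwl(x, n, table):
    """Approximate x**(1/n) for x > 0 using one multiply-add on the mantissa."""
    if x <= 0.0:
        raise ValueError("this sketch only handles positive inputs")
    frac, exp = math.frexp(x)      # x = frac * 2**exp, frac in [0.5, 1)
    m = frac * 2.0                 # mantissa rescaled to [1, 2)
    e = exp - 1                    # so that x = m * 2**e
    q, r = divmod(e, n)            # e = n*q + r with 0 <= r < n
    idx = min(int((m - 1.0) * len(table)), len(table) - 1)
    slope, intercept = table[idx]
    root_m = slope * m + intercept # PWL estimate of m**(1/n)
    # 2**(r/n) takes only n distinct values; hardware would likely use a
    # small lookup table here (assumption), so it is computed directly.
    return math.ldexp(root_m * 2.0 ** (r / n), q)


def max_relative_error(n, segments, samples=4096):
    """Sweep the mantissa range and report the worst relative error seen."""
    table = build_pwl_table(n, segments)
    worst = 0.0
    for k in range(samples):
        x = 1.0 + k / samples
        exact = x ** (1.0 / n)
        worst = max(worst, abs(nth_root_pwl(x, n, table) - exact) / exact)
    return worst
```

Calling `max_relative_error(n, segments)` for increasing `segments` shows the trade-off the abstract describes: wider segments mean a smaller table but a larger worst-case relative error, so a design would pick the widest segments that still meet the error target.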