A Novel Fully Hardware-Implemented SVD Solver Based on Ultra-Parallel BCV Jacobi Algorithm

Tang Hu, Xiangdi Li, Xiao Yu, Songnan Ren, Li Yan, Xuyang Bai, Zhiwei Xu, Shiqiang Zhu
DOI: https://doi.org/10.1109/tcsii.2022.3200750
2022-01-01
IEEE Transactions on Circuits & Systems II Express Briefs
Abstract: Efficient FPGA-based floating-point singular value decomposition (SVD) is challenging because its complexity grows rapidly with the matrix dimension. Numerous hardware architectures have been proposed to improve the performance of SVD by increasing the capacity of computation units, reusing data, and enhancing bandwidth. These designs, however, are not optimal due to their low parallelism, poor data access efficiency, and inferior iteration scheduling. In this brief, we propose a block column vector Hestenes-Jacobi (BCV Jacobi) algorithm that decomposes an arbitrarily large matrix into several blocks, enhances access efficiency through a customized data structure, and improves system-level parallelism by simplifying the iteration scheduling. The proposed BCV Jacobi algorithm also achieves better scalability and efficiency. Experimental results show that the proposed FPGA-based SVD processor is superior to other SVD implementations in terms of parallelism, data access efficiency, matrix size, and execution time. Compared with a state-of-the-art SVD accelerator engine, the proposed algorithm achieves a speedup of over $2\times$ on average.
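For readers unfamiliar with the underlying method, the following is a minimal software sketch of the classical one-sided Hestenes-Jacobi iteration that the BCV Jacobi algorithm builds on. It is not the paper's block-partitioned variant or its hardware architecture. The function name `hestenes_jacobi_singular_values`, the tolerance, and the sweep limit are illustrative assumptions, and only singular values are computed.

```python
import math


def hestenes_jacobi_singular_values(a, tol=1e-12, max_sweeps=60):
    """Singular values of a dense matrix (list of rows), in descending order.

    Classical one-sided Hestenes-Jacobi: rotate pairs of columns until all
    columns are mutually orthogonal; the column norms are then the singular
    values. Hypothetical helper for illustration only.
    """
    m, n = len(a), len(a[0])
    # Store the matrix by columns, since each rotation touches two columns.
    cols = [[a[r][c] for r in range(m)] for c in range(n)]

    for _ in range(max_sweeps):
        rotated = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                alpha = sum(x * x for x in cols[i])   # squared norm of column i
                beta = sum(y * y for y in cols[j])    # squared norm of column j
                gamma = sum(x * y for x, y in zip(cols[i], cols[j]))  # inner product
                # Skip pairs that are already orthogonal to within tolerance.
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue
                rotated = True
                # Rotation angle that zeroes the inner product of the pair.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                ci, cj = cols[i], cols[j]
                cols[i] = [c * x - s * y for x, y in zip(ci, cj)]
                cols[j] = [s * x + c * y for x, y in zip(ci, cj)]
        if not rotated:
            break  # a full sweep without rotations means convergence

    return sorted((math.sqrt(sum(x * x for x in col)) for col in cols), reverse=True)
```

Each rotation reads and writes only two columns, so rotations on disjoint column pairs are independent of one another. That independence is the general property that parallel Jacobi schedules exploit; how the paper groups columns into blocks and schedules them is specific to its design and is not reproduced here.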