Resource Reduction of BFGS Quasi-Newton Implementation on FPGA Using Fixed-Point Matrix Updating

Jia Liu, Qiang Liu
DOI: https://doi.org/10.1109/fpl.2018.00058
2018-01-01
Abstract: Quasi-Newton (QN) methods are now widely used for neural network (NN) training due to their high effectiveness. In practice, the iterative process of QN methods implemented in software is often very time-consuming. To accelerate the training process, a floating-point BFGS-QN implementation has been realized on FPGA. Analysis of the performance of this BFGS-QN implementation shows that updating the inverse of the approximate Hessian matrix B is the most computation- and memory-intensive part. Therefore, a fixed-point hardware design of B matrix updating is proposed in this paper. The fixed-point representation can lead to overflow and underflow during the computation, which degrades the convergence performance of the training process. To address these issues, matrix property checking and precision scaling schemes are proposed, giving a tradeoff between resource and precision. The experimental results show that, compared with the single-precision floating-point BFGS-QN, the mixed-precision BFGS-QN with the fixed-point B matrix updating design achieves reductions of up to 10.9% in LUTs, 20.2% in FFs and 18.1% in BRAMs, while the training speed is not satisfied.
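For readers unfamiliar with the B matrix update, the following is a minimal sketch of the textbook BFGS update of the inverse Hessian approximation, together with a saturating fixed-point quantizer that shows how overflow and underflow can arise. It is not the authors' hardware design: the function names, the 32-bit word length, and the 16 fractional bits are assumptions chosen only for illustration, and the paper's matrix property checking and precision scaling schemes are not reproduced here.

```python
def to_fixed(x, frac_bits=16, word_bits=32):
    """Quantize x to a signed fixed-point integer, saturating on overflow.

    frac_bits and word_bits are illustrative assumptions, not values
    taken from the paper.
    """
    scaled = round(x * (1 << frac_bits))
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    return max(lo, min(hi, scaled))


def from_fixed(q, frac_bits=16):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def bfgs_inverse_update(B, s, y):
    """Textbook BFGS update of a symmetric inverse Hessian approximation B.

    B_new = (I - rho*s*y^T) B (I - rho*y*s^T) + rho*s*s^T, rho = 1/(y^T s),
    expanded for symmetric B as
    B_new = B - rho*(s*(By)^T + (By)*s^T) + (rho^2 * y^T B y + rho) * s*s^T.
    """
    n = len(s)
    ys = sum(y[i] * s[i] for i in range(n))
    if ys <= 0.0:
        # Curvature condition fails: skip the update and keep B unchanged.
        return [row[:] for row in B]
    rho = 1.0 / ys
    By = [sum(B[i][j] * y[j] for j in range(n)) for i in range(n)]
    yBy = sum(y[i] * By[i] for i in range(n))
    coeff = rho * rho * yBy + rho
    return [
        [
            B[i][j] - rho * (s[i] * By[j] + By[i] * s[j]) + coeff * s[i] * s[j]
            for j in range(n)
        ]
        for i in range(n)
    ]


if __name__ == "__main__":
    B = [[1.0, 0.0], [0.0, 1.0]]
    s = [1.0, 0.0]   # step in the parameters
    y = [2.0, 0.0]   # change in the gradient
    B_new = bfgs_inverse_update(B, s, y)
    print(B_new)
    # Entries of B stored in fixed point: large values saturate (overflow),
    # very small values round to zero (underflow).
    print(to_fixed(1e6), to_fixed(1e-9))
```

With these assumed word lengths, a value such as 1e6 saturates at the largest representable integer and a value such as 1e-9 rounds to zero, which is the kind of overflow and underflow the abstract identifies as harmful to convergence.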