DT-CGRA: Dual-track Coarse-Grained Reconfigurable Architecture for Stream Applications

Xitian Fan, Huimin Li, Wei Cao, Lingli Wang
DOI: https://doi.org/10.1109/fpl.2016.7577309
2016-01-01
Abstract: This paper presents a new type of coarse-grained reconfigurable architecture (CGRA) for the object inference domain in machine learning. The proposed CGRA is optimized for stream processing, and a corresponding programming model, called the dual-track model, is proposed. The CGRA is realized in Verilog HDL and implemented in the SMIC 55 nm process, with a footprint of 3.79 mm² and a power consumption of 1.79 W at 500 MHz. To evaluate performance, eight machine-learning algorithms are selected as benchmarks: HOG, CNN, k-means, PCA, SPM, linear-SVM, Softmax, and Joint-Bayesian. These algorithms cover a general machine-learning flow in the object inference domain: feature extraction, feature selection, and inference. The experimental results show that the proposed CGRA achieves, on average, 1443× the energy efficiency of the Intel i7-3770 CPU and 7.82× the energy efficiency of a high-performance FPGA solution [19].