Hardware Implementation and Optimization of Tiny-YOLO Network.

Jing Ma, Li Chen, Zhiyong Gao
DOI: https://doi.org/10.1007/978-981-10-8108-8_21
2017-01-01
Abstract: Convolutional Neural Networks (CNNs) have achieved extraordinary performance in image processing. However, CNNs are both computationally intensive and memory intensive, which makes them difficult to deploy on hardware such as embedded systems. Although much existing work has explored hardware implementations of CNNs, these designs remain either inefficient or incomplete. In this paper, we therefore propose a highly parallel design that performs CNN computation efficiently. Furthermore, whereas previous work rarely takes Fully-Connected (FC) layers into consideration, our work also performs well in FC optimization.