YOLoC: DeploY Large-Scale Neural Network by ROM-based Computing-in-Memory using ResiduaL Branch on a Chip

Yiming Chen, Guodong Yin, Zhanhong Tan, Mingyen Lee, Zekun Yang, Yongpan Liu, Huazhong Yang, Kaisheng Ma, Xueqing Li
DOI: https://doi.org/10.1145/3489517.3530576
2022-06-01
Abstract: Computing-in-memory (CiM) is a promising technique to achieve high energy efficiency in data-intensive matrix-vector multiplication (MVM) by relieving the memory bottleneck. Unfortunately, due to the limited SRAM capacity, existing SRAM-based CiM needs to reload the weights from DRAM in large-scale networks. This undesired fact weakens the energy efficiency significantly. This work, for the first time, proposes the concept, design, and optimization of computing-in-ROM to achieve much higher on-chip memory capacity, and thus less DRAM access and lower energy consumption. Furthermore, to support different computing scenarios with varying weights, a weight fine-tune technique, namely Residual Branch (ReBranch), is also proposed. ReBranch combines ROM-CiM and assisting SRAM-CiM to achieve high versatility. YOLoC, a ReBranch-assisted ROM-CiM framework for object detection, is presented and evaluated. With the same area in 28nm CMOS, YOLoC for several datasets has shown significant energy efficiency improvement by 14.8x for YOLO (Darknet-19) and 4.8x for ResNet-18, with <8% latency overhead and almost no mean average precision (mAP) loss (-0.5% ~ +0.2%), compared with the fully SRAM-based CiM.
Hardware Architecture