Performance estimation for the memristor-based computing-in-memory implementation of extremely factorized network for real-time and low-power semantic segmentation

Shuai Dong,Zhen Fan,Yihong Chen,Kaihui Chen,Minghui Qin,Min Zeng,Xubing Lu,Guofu Zhou,Xingsen Gao,Jun-Ming Liu
DOI: https://doi.org/10.1016/j.neunet.2023.01.008
Abstract:Nowadays many semantic segmentation algorithms have achieved satisfactory accuracy on von Neumann platforms (e.g., GPU), but the speed and energy consumption have not meet the high requirements of certain edge applications like autonomous driving. To tackle this issue, it is of necessity to design an efficient lightweight semantic segmentation algorithm and then implement it on emerging hardware platforms with high speed and energy efficiency. Here, we first propose an extremely factorized network (EFNet) which can learn multi-scale context information while preserving rich spatial information with reduced model complexity. Experimental results on the Cityscapes dataset show that EFNet achieves an accuracy of 68.0% mean intersection over union (mIoU) with only 0.18M parameters, at a speed of 99 frames per second (FPS) on a single RTX 3090 GPU. Then, to further improve the speed and energy efficiency, we design a memristor-based computing-in-memory (CIM) accelerator for the hardware implementation of EFNet. It is shown by the simulation in DNN+NeuroSim V2.0 that the memristor-based CIM accelerator is ∼63× (∼4.6×) smaller in area, at most ∼9.2× (∼1000×) faster, and ∼470× (∼2400×) more energy-efficient than the RTX 3090 GPU (the Jetson Nano embedded development board), although its accuracy slightly decreases by 1.7% mIoU. Therefore, the memristor-based CIM accelerator has great potential to be deployed at the edge to implement lightweight semantic segmentation models like EFNet. This study showcases an algorithm-hardware co-design to realize real-time and low-power semantic segmentation at the edge.
What problem does this paper attempt to address?