CNN-based Feature-point Extraction for Real-time Visual SLAM on Embedded FPGA

Zhilin Xu, Jincheng Yu, Chao Yu, Hao Shen, Yu Wang, Huazhong Yang
DOI: https://doi.org/10.1109/fccm48280.2020.00014
2020-01-01
Abstract: Feature-point extraction is a fundamental step in many applications, such as image matching and Simultaneous Localization and Mapping (SLAM). CNN-based feature-point extraction methods have made significant progress in both feature-point detection and descriptor generation compared with handcrafted methods. However, their computational and storage complexity makes it difficult for CNNs to run on real-time embedded systems. In this paper, we aim to deploy advanced CNN-based feature-point extraction methods onto real-time embedded FPGA systems. We optimize the softmax data flow so that the computation of softmax and non-maximum suppression (NMS) is reduced by 64×. We generate the normalized descriptors only after picking the feature points with the highest confidence, so that the computation cost of normalization is reduced by 1500×. We use fixed-point arithmetic in both the CNN backbone and the post-processing operations, and implement them on the ZCU102 FPGA platform. The experimental results show that our proposed hardware-software co-designed CNN-based feature-point extraction method outperforms handcrafted techniques. Our feature-point extraction runs at 20 fps on the embedded platform, meeting the real-time requirement.
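The descriptor reordering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flat list layout, and the sizes (4800 locations, 8-dimensional descriptors, 10 selected points) are hypothetical and chosen only for illustration. The sketch shows that selecting the highest-confidence points before L2 normalization gives the same descriptors as normalizing everything first, while performing far fewer normalizations.

```python
import math
import random

def l2_normalize(vec):
    """Scale a vector to unit L2 norm (a small epsilon avoids division by zero)."""
    norm = math.sqrt(sum(x * x for x in vec)) + 1e-12
    return [x / norm for x in vec]

def top_k_indices(confidence, k):
    """Return the indices of the k highest-confidence locations."""
    order = sorted(range(len(confidence)), key=lambda i: confidence[i], reverse=True)
    return order[:k]

def normalize_then_select(descriptors, confidence, k):
    """Baseline order: normalize every descriptor, then keep the top k."""
    normalized = [l2_normalize(d) for d in descriptors]
    picked = top_k_indices(confidence, k)
    return [normalized[i] for i in picked], len(descriptors)

def select_then_normalize(descriptors, confidence, k):
    """Reordered: keep the top k first, then normalize only those."""
    picked = top_k_indices(confidence, k)
    return [l2_normalize(descriptors[i]) for i in picked], len(picked)

# Hypothetical sizes for illustration only; they are not taken from the paper.
random.seed(0)
num_locations, dim, k = 4800, 8, 10
descriptors = [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(num_locations)]
confidence = [random.random() for _ in range(num_locations)]

baseline, baseline_count = normalize_then_select(descriptors, confidence, k)
reordered, reordered_count = select_then_normalize(descriptors, confidence, k)
```

In this sketch the second return value counts how many normalizations each ordering performs, so the saving is the ratio of dense locations to selected points. The actual factor reported in the abstract depends on the image resolution and the number of feature points the authors keep, which this example does not attempt to reproduce.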