L4Net: an Anchor‐free Generic Object Detector with Attention Mechanism for Autonomous Driving

Yanan Wu, Songhe Feng, Xiankai Huang, Zizhang Wu
DOI: https://doi.org/10.1049/cvi2.12015
IF: 1.484
2021-01-01
IET Computer Vision
Abstract: Generic object detection is a crucial task for autonomous driving. A safe and efficient object detector must offer high accuracy, real-time inference speed and a small model size. Herein, a simple yet effective anchor-free object detector named L4Net is proposed, which incorporates a keypoint detection backbone and a co-attention scheme into a unified framework, and achieves lower computation cost with higher detection accuracy than prior art across a wide spectrum of resource constraints. Specifically, the backbone utilizes a Multi-scale Receptive-fields Enhancement (MRE) module to capture context-wise information, so that object scale and shape invariance are considered simultaneously. The co-attention scheme integrates the strengths of both Class-agnostic Attention (CA) and Semantic Attention (SA), and explores valuable features from low level to high level to generate more accurate prediction boxes. Compared with previous feature fusion strategies, multi-scale features are selectively integrated by fully exploiting the different characteristics of low-level and high-level features, which leads to a smaller model size and faster inference speed. Extensive experiments on four well-known datasets demonstrate the effectiveness of the method. For instance, L4Net achieves 71.68% mAP on the KITTI test set with a 13.7 M model size, running at 149 FPS on NVIDIA TX and 30.7 FPS on a Qualcomm-based device, which is 4x smaller and 2x faster than the baseline model.
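The abstract does not spell out how an anchor-free, keypoint-based detector turns its outputs into boxes. As a rough illustration only, the sketch below shows a generic CenterNet-style decoding step: each local peak in a per-class center heatmap becomes one box, with no anchor boxes involved. It is not taken from L4Net; the function name `decode_centers`, the `threshold` and `stride` parameters, and the plain-list data layout are all assumptions made for the example.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, score) in input-image pixels
Box = Tuple[float, float, float, float, float]


def decode_centers(
    heatmap: List[List[float]],
    sizes: List[List[Tuple[float, float]]],
    threshold: float = 0.3,  # hypothetical score cut-off
    stride: int = 4,         # hypothetical downsampling factor of the heatmap
) -> List[Box]:
    """Decode boxes from a center heatmap without anchors.

    heatmap: H x W grid of center scores in [0, 1] for one class.
    sizes:   H x W grid of (width, height) predictions in input pixels.
    """
    h, w = len(heatmap), len(heatmap[0])
    boxes: List[Box] = []
    for y in range(h):
        for x in range(w):
            score = heatmap[y][x]
            if score < threshold:
                continue
            # Keep only local maxima of the 3x3 neighbourhood; in
            # keypoint-based detectors this peak test stands in for NMS.
            neighbours = [
                heatmap[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            ]
            if any(n > score for n in neighbours):
                continue
            # Map the heatmap cell back to input-image coordinates.
            cx, cy = x * stride, y * stride
            bw, bh = sizes[y][x]
            boxes.append((cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2, score))
    # Highest-scoring detections first.
    return sorted(boxes, key=lambda b: b[4], reverse=True)
```

Because every box comes from a single heatmap peak, there are no anchor shapes to tune, which is one reason this family of detectors tends to have compact heads.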