Abstract: With the advent of autonomous vehicle applications, 3D object detection from LiDAR point clouds has become critically important. Recent studies have demonstrated that methods which aggregate features from voxels can detect objects accurately and efficiently in large, complex 3D scenes. Nevertheless, most of these methods do not filter background points well and perform poorly on small objects. To address these issues, this paper proposes an Attention-based and Multiscale Feature Fusion Network (AMFF-Net), which uses a Dual-Attention Voxel Feature Extractor (DA-VFE) and a Multi-scale Feature Fusion (MFF) Module to improve the precision and efficiency of 3D object detection. The DA-VFE integrates pointwise and channelwise attention into the Voxel Feature Extractor (VFE) to emphasize key point cloud information within each voxel and produce more representative voxel features. The MFF Module consists of self-calibrated convolutions, a residual structure, and a coordinate attention mechanism. It acts as the 2D backbone, expanding the receptive field and capturing more contextual information, which helps localize small objects, strengthens the feature-extraction capability of the network, and reduces the computational overhead. We evaluated the proposed model on the nuScenes dataset, which covers a large number of driving scenarios. The experimental results showed that AMFF-Net achieved 62.8% mAP, significantly improving small object detection over the baseline network and significantly reducing the computational overhead, while the inference speed remained essentially the same. AMFF-Net also achieved strong performance on the KITTI dataset.
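As a rough illustration of the dual-attention idea described in the abstract, the following NumPy sketch reweights the points inside each voxel with a pointwise score and a channelwise score before max-pooling them into a single voxel feature. It is a minimal sketch under stated assumptions, not the DA-VFE as implemented in the paper: the function name `dual_attention_voxel_features`, the weight arrays `w_point` and `w_channel`, the sigmoid gating, and the max-pooling step are all hypothetical choices made for the example.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def dual_attention_voxel_features(voxels, mask, w_point, w_channel):
    """Hypothetical dual-attention pooling over voxelized points.

    voxels:    (V, T, C) per-point features, V voxels with up to T points each
    mask:      (V, T) boolean, True where a slot holds a real point
    w_point:   (C,) assumed weights that score each point
    w_channel: (C, C) assumed weights that score each channel
    returns:   (V, C) one feature vector per voxel
    """
    m = mask[..., None].astype(voxels.dtype)             # (V, T, 1)

    # Pointwise attention: one gate in (0, 1) per point.
    point_gate = sigmoid(voxels @ w_point)[..., None]    # (V, T, 1)

    # Channelwise attention: one gate per channel, computed from the
    # mean feature of the real points in each voxel.
    counts = np.maximum(m.sum(axis=1), 1.0)              # (V, 1)
    mean_feat = (voxels * m).sum(axis=1) / counts        # (V, C)
    channel_gate = sigmoid(mean_feat @ w_channel)[:, None, :]  # (V, 1, C)

    # Apply both gates, then max-pool over the real points only.
    weighted = voxels * point_gate * channel_gate
    weighted = np.where(mask[..., None], weighted, -np.inf)
    pooled = weighted.max(axis=1)                        # (V, C)

    # Voxels with no real points produce -inf; map them to zeros.
    return np.where(np.isfinite(pooled), pooled, 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(3, 5, 4))
    valid = np.zeros((3, 5), dtype=bool)
    valid[0, :3] = True   # voxel 0 holds three points
    valid[1, :] = True    # voxel 1 is full
    out = dual_attention_voxel_features(
        feats, valid, rng.normal(size=4), rng.normal(size=(4, 4))
    )
    print(out.shape)
```

In this sketch, padded slots never influence the result because both the channel statistics and the pooling step are restricted to real points; a learned implementation would replace the fixed weight arrays with trainable layers.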

ARFA: Adaptive Reception Field Aggregation for 3-D Detection from LiDAR Point Cloud

Adaptive Recurrent Forward Network for Dense Point Cloud Completion

FARP-Net: Local-Global Feature Aggregation and Relation-Aware Proposals for 3D Object Detection

PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection

Spatial Information Enhancement with Multi-Scale Feature Aggregation for Long-Range Object and Small Reflective Area Object Detection from Point Cloud

AMFF-Net: An Effective 3D Object Detector Based on Attention and Multi-Scale Feature Fusion

BAFusion: Bidirectional Attention Fusion for 3D Object Detection Based on LiDAR and Camera

AGO-Net: Association-Guided 3D Point Cloud Object Detection Network

Three-Dimensional Point Cloud Object Detection Based on Feature Fusion and Enhancement

ACF-Net: Asymmetric Cascade Fusion for 3D Detection with LiDAR Point Clouds and Images

Multi-View Adaptive Fusion Network for 3D Object Detection

RPFA-Net: a 4D RaDAR Pillar Feature Attention Network for 3D Object Detection

Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion

FS-Net: LiDAR-Camera Fusion With Matched Scale for 3D Object Detection in Autonomous Driving

AFANet: A Multibackbone Compatible Feature Fusion Framework for Effective Remote Sensing Object Detection

From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network

Adaptive Scale and Spatial Aggregation for Real-Time Object Detection

GAFusion: Adaptive Fusing LiDAR and Camera with Multiple Guidance for 3D Object Detection

MMAF-Net: Multi-view multi-stage adaptive fusion for multi-sensor 3D object detection

FFPA-Net: Efficient Feature Fusion with Projection Awareness for 3D Object Detection

Fully Sparse Fusion for 3D Object Detection