Abstract: Object detection is a central problem in intelligent traffic systems, and recent advances in single-vehicle lidar-based 3D detection show that it can provide accurate position information for intelligent agents to make decisions and plan. Compared with single-vehicle perception, multi-view vehicle-road cooperative perception has fundamental advantages, such as the elimination of blind spots and a broader perception range, and has become a research hotspot. However, current cooperative perception work focuses on increasingly complex fusion while ignoring the fundamental problems caused by missing object outlines in single-view data. We propose a multi-view vehicle-road cooperative perception system, vehicle-to-everything cooperative perception (V2X-AHD), to enhance identification capability, particularly for predicting vehicle shape. First, we propose an asymmetric heterogeneous distillation network, fed with different training data, that improves the accuracy of contour recognition by transferring multi-view teacher features to single-view student features. Because point cloud data are sparse, we propose Spara Pillar, a sparse convolution-based plug-in feature extraction backbone, to reduce the number of parameters and enhance feature extraction. Moreover, we use multi-head self-attention (MSA) to fuse the single-view features, and the lightweight design yields a smooth fused feature representation. Results on the large-scale open dataset V2Xset demonstrate that our method achieves state-of-the-art performance. According to this study, V2X-AHD effectively improves the accuracy of 3D object detection while reducing the number of network parameters, and serves as a benchmark for cooperative perception. The code for this article is available at https://github.com/feeling0414-lab/V2X-AHD.
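
The teacher-to-student feature transfer described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state which distance is used between teacher and student features, so mean squared error is assumed here, and the names `distillation_loss`, `total_loss`, and `weight`, as well as the cells-by-channels feature layout, are hypothetical.

```python
from typing import List

# Hypothetical layout: one inner list of channel values per spatial cell.
FeatureMap = List[List[float]]


def distillation_loss(student: FeatureMap, teacher: FeatureMap) -> float:
    """Mean squared error between student and teacher feature maps.

    Assumption: MSE stands in for whatever distance the paper uses.
    """
    if len(student) != len(teacher):
        raise ValueError("feature maps must have the same number of cells")
    total = 0.0
    count = 0
    for s_cell, t_cell in zip(student, teacher):
        if len(s_cell) != len(t_cell):
            raise ValueError("cells must have the same number of channels")
        for s, t in zip(s_cell, t_cell):
            total += (s - t) ** 2
            count += 1
    return total / count if count else 0.0


def total_loss(detection_loss: float, student: FeatureMap,
               teacher: FeatureMap, weight: float = 1.0) -> float:
    """Detection loss plus a weighted distillation term (weight is assumed)."""
    return detection_loss + weight * distillation_loss(student, teacher)
```

In such a setup the teacher features would come from multi-view input and the student features from single-view input, so minimizing the extra term pulls the student toward the more complete multi-view representation.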

Vehicle-Infrastructure Cooperative 3D Object Detection via Feature Flow Prediction

Flow-Based Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection

EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection

CoFF: Cooperative Spatial Feature Fusion for 3D Object Detection on Autonomous Vehicles

Multistage Fusion Approach of Lidar and Camera for Vehicle-Infrastructure Cooperative Object Detection

Occlusion-Guided Multi-Modal Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection

CenterCoop: Center-Based Feature Aggregation for Communication-Efficient Vehicle-Infrastructure Cooperative 3D Object Detection

A Multi-view 3D Vehicle Detection Method Based On Novel 3D Proposal Generation Method

CoFormerNet: A Transformer-Based Fusion Approach for Enhanced Vehicle-Infrastructure Cooperative Perception

Cooperative LIDAR Object Detection Via Feature Sharing in Deep Networks

Cooperative Perception for 3D Object Detection in Driving Scenarios using Infrastructure Sensors

Multifeature Fusion-Based Object Detection for Intelligent Transportation Systems

Enhancing 3D object detection through multi-modal fusion for cooperative perception

V2X-AHD: Vehicle-to-Everything Cooperation Perception via Asymmetric Heterogenous Distillation Network

Deep LiDAR-Radar-Visual Fusion for Object Detection in Urban Environments

PAFNet: Pillar Attention Fusion Network for Vehicle–Infrastructure Cooperative Target Detection Using LiDAR

MFF-Net: Multimodal Feature Fusion Network for 3D Object Detection

ViT-FuseNet: MultiModal Fusion of Vision Transformer for Vehicle-Infrastructure Cooperative Perception

AMFF-Net: An Effective 3D Object Detector Based on Attention and Multi-Scale Feature Fusion

HEAD: A Bandwidth-Efficient Cooperative Perception Approach for Heterogeneous Connected and Autonomous Vehicles