Abstract: In this paper, we investigate improving the perception performance of autonomous vehicles through communication with other vehicles and road infrastructures. To this end, we introduce a novel collaborative perception architecture, called ParCon, which connects multiple modules in parallel, as opposed to the sequential connections used in most other collaborative perception methods. Through extensive experiments, we demonstrate that ParCon inherits the advantages of parallel connection. Specifically, ParCon is robust to noise, as the parallel architecture allows each module to manage noise independently and complement the limitations of other modules. As a result, ParCon achieves state-of-the-art accuracy, particularly in noisy environments, such as real-world datasets, increasing detection accuracy by 6.91%. Additionally, ParCon is computationally efficient, reducing floating-point operations (FLOPs) by 11.46%.

What problem does this paper attempt to address?

This paper aims to improve the perception performance of autonomous vehicles (AVs) in various environments through vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication. Specifically, it proposes a new collaborative perception architecture, ParCon, which targets the noise that communication introduces in existing methods, and in particular the way that noise is amplified layer by layer when multiple modules are connected serially.

### Specific background and challenges of the problem:

1. **Limitations of single-vehicle perception**: The perception system of a single vehicle is significantly limited by occlusion and by the limited range of its sensors.
2. **Impact of communication noise**: In multi-agent collaborative perception, noise is inevitably introduced during communication. Most existing V2X collaborative perception models adopt a serial architecture, which tends to amplify the impact of noise and thereby reduces detection accuracy.
3. **Efficiency problems of existing models**: Existing collaborative perception models are redundant in terms of computation and parameter count, which makes them inefficient.

### Solutions of ParCon:

- **Parallel connection architecture**: ParCon connects multiple modules in parallel, so each module can handle noise independently and compensate for the deficiencies of the other modules, which improves robustness to noise.
- **Channel Compression Layer (CCL)**: To improve efficiency, ParCon introduces the Channel Compression Layer, which reduces the number of model parameters and floating-point operations (FLOPs) while maintaining high performance.
- **Heterogeneous Relative Pose Encoding (HRPE)**: By taking into account information such as agent type, relative angle, and distance, HRPE improves the model's robustness to noise and its perception accuracy.

### Main contributions:

1. **Proposing the ParCon model**: ParCon is a collaborative perception model based on parallel connection, designed for 3D object detection. It reaches state-of-the-art detection accuracy, computational efficiency, and noise robustness.
2. **Verifying the advantages of parallel connection**: Experiments show that parallel connection outperforms serial connection in dealing with communication noise.
3. **Introducing efficient fusion modules**: These include A-Att (agent-level attention), S-Att (spatial-level attention), and S-Conv (spatial convolution), with a lightweight design achieved through CCL.

### Experimental results:

- **Improvement in detection accuracy**: On the simulated datasets (V2XSet and OPV2V) and the real-world dataset (DAIR-V2X), ParCon improves AP@0.7 over the best existing models by 5.83% and 6.91%, respectively, under the mild noise setting.
- **Improvement in computational efficiency**: Compared to CoAlign and V2X-ViT, ParCon reduces the number of parameters by 56.95% and 60.69%, respectively, and GFLOPs by 11.46% and 70.38%, respectively.
- **Strong noise robustness**: Across different noise conditions, ParCon is consistently more robust than other models, especially with respect to delay, heading error, and positioning error.

In summary, this paper addresses the deficiencies of existing collaborative perception models in noise robustness and computational efficiency by proposing and verifying the ParCon model, providing a better solution for the perception system of autonomous vehicles.
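The structural difference between serial and parallel connection described above can be illustrated with a small NumPy sketch. This is a toy illustration under stated assumptions, not the ParCon implementation: the three functions `a_att`, `s_att`, and `s_conv` are hypothetical stand-ins that only borrow the module names from the summary, and averaging the branch outputs is an assumed way of combining them. Features are assumed to have shape `(agents, channels, height, width)`.

```python
import numpy as np


def a_att(x):
    """Toy agent-level attention: every agent attends to all agents per cell."""
    n_channels = x.shape[1]
    scores = np.einsum("achw,bchw->abhw", x, x) / np.sqrt(n_channels)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)  # softmax over agents
    return np.einsum("abhw,bchw->achw", weights, x)


def s_att(x):
    """Toy spatial attention: a sigmoid gate computed from the channel mean."""
    gate = 1.0 / (1.0 + np.exp(-x.mean(axis=1, keepdims=True)))
    return x * gate


def s_conv(x):
    """Toy spatial convolution: a 3x3 mean filter with edge padding."""
    h, w = x.shape[2], x.shape[3]
    padded = np.pad(x, ((0, 0), (0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros(x.shape, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += padded[:, :, dy:dy + h, dx:dx + w]
    return out / 9.0


def sequential_fusion(x):
    """Serial wiring: each module consumes the previous module's output."""
    return s_conv(s_att(a_att(x)))


def parallel_fusion(x):
    """Parallel wiring: every module sees the same input; outputs are averaged.

    Averaging is an assumption made for this sketch.
    """
    return (a_att(x) + s_att(x) + s_conv(x)) / 3.0


# Hypothetical feature maps for 3 agents, 8 channels, on a 16x16 grid.
rng = np.random.default_rng(0)
features = rng.normal(size=(3, 8, 16, 16))
print(sequential_fusion(features).shape, parallel_fusion(features).shape)
```

In the serial wiring, any error in the output of `a_att` is the input to `s_att` and then to `s_conv`, so later modules never see the original features. In the parallel wiring, each branch operates on the unmodified input, which is the property the summary credits for ParCon's noise robustness.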

ParCon: Noise-Robust Collaborative Perception via Multi-module Parallel Connection
