ParCon: Noise-Robust Collaborative Perception via Multi-module Parallel Connection

Hyunchul Bae, Minhee Kang, Heejin Ahn
2024-10-14
Abstract: In this paper, we investigate improving the perception performance of autonomous vehicles through communication with other vehicles and road infrastructures. To this end, we introduce a novel collaborative perception architecture, called ParCon, which connects multiple modules in parallel, as opposed to the sequential connections used in most other collaborative perception methods. Through extensive experiments, we demonstrate that ParCon inherits the advantages of parallel connection. Specifically, ParCon is robust to noise, as the parallel architecture allows each module to manage noise independently and complement the limitations of other modules. As a result, ParCon achieves state-of-the-art accuracy, particularly in noisy environments, such as real-world datasets, increasing detection accuracy by 6.91%. Additionally, ParCon is computationally efficient, reducing floating-point operations (FLOPs) by 11.46%.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper aims to improve the perception performance of autonomous vehicles (AVs) in various environments through vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication. Specifically, it proposes a new collaborative perception architecture, ParCon, to overcome the noise that communication introduces in existing methods, in particular the layer-by-layer amplification of noise when multiple modules are connected in series.

### Specific background and challenges of the problem:

1. **Limitations of single-vehicle perception**: The perception system of a single vehicle is significantly limited by occlusion and by the limited range of its sensors.
2. **Impact of communication noise**: In multi-agent collaborative perception, the communication process inevitably introduces noise. Most existing V2X collaborative perception models adopt a serial architecture, which tends to amplify the impact of noise and thereby reduces detection accuracy.
3. **Efficiency problems of existing models**: Existing collaborative perception models are redundant in computation and in parameter count, which makes them inefficient.

### Solutions of ParCon:

- **Parallel connection architecture**: ParCon connects multiple modules in parallel, so each module processes noise independently and compensates for the deficiencies of the other modules, which improves robustness to noise.
- **Channel Compression Layer (CCL)**: To improve model efficiency, ParCon introduces the Channel Compression Layer, which reduces the number of model parameters and floating-point operations (FLOPs) while maintaining high performance.
- **Heterogeneous Relative Pose Encoding (HRPE)**: By taking into account information such as agent type, relative angle, and distance, HRPE improves the model's robustness to noise and its perception accuracy.

### Main contributions:

1. **Proposing the ParCon model**: A collaborative perception model based on parallel connection, designed for 3D object detection. ParCon reaches the state of the art in detection accuracy, computational efficiency, and noise robustness.
2. **Verifying the advantages of parallel connection**: Experiments show that parallel connection handles communication noise better than serial connection.
3. **Introducing efficient fusion modules**: These include A-Att (agent-level attention), S-Att (space-level attention), and S-Conv (space convolution), with a lightweight design achieved through CCL.

### Experimental results:

- **Improvement in detection accuracy**: On the simulated datasets (V2XSet and OPV2V) and the real-world dataset (DAIR-V2X), ParCon improves AP@0.7 by 5.83% and 6.91%, respectively, over the best existing models under the mild noise setting.
- **Improvement in computational efficiency**: Compared to CoAlign and V2X-ViT, ParCon reduces the number of parameters by 56.95% and 60.69%, respectively, and GFLOPs by 11.46% and 70.38%, respectively.
- **Strong noise robustness**: Under different noise conditions, ParCon consistently shows stronger robustness than other models, especially with respect to delay, heading error, and positioning error.

In summary, this paper addresses the deficiencies of existing collaborative perception models in noise robustness and computational efficiency by proposing and verifying the ParCon model, providing a better solution for the perception system of autonomous vehicles.
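The difference between serial and parallel connection of modules, which the summary above relies on, can be illustrated with a toy sketch. This is not the ParCon implementation: the three modules (`scale`, `shift`, `smooth`) are hypothetical stand-ins for fusion modules, features are plain lists of floats, and averaging the parallel outputs is an assumption chosen for simplicity rather than the combination rule used in the paper.

```python
from typing import Callable, List, Sequence

Feature = List[float]
Module = Callable[[Feature], Feature]


def sequential(modules: Sequence[Module], x: Feature) -> Feature:
    """Serial wiring: each module consumes the previous module's output."""
    out = list(x)
    for module in modules:
        out = module(out)
    return out


def parallel(modules: Sequence[Module], x: Feature) -> Feature:
    """Parallel wiring: every module sees the same input.

    The outputs are averaged element-wise (an assumption for this sketch).
    """
    outputs = [module(list(x)) for module in modules]
    return [sum(values) / len(outputs) for values in zip(*outputs)]


# Hypothetical stand-in modules, not the paper's A-Att, S-Att, or S-Conv.
def scale(x: Feature) -> Feature:
    return [2.0 * v for v in x]


def shift(x: Feature) -> Feature:
    return [v + 1.0 for v in x]


def smooth(x: Feature) -> Feature:
    mean = sum(x) / len(x)
    return [0.5 * (v + mean) for v in x]


modules = [scale, shift, smooth]
x = [1.0, 3.0]

seq_out = sequential(modules, x)  # smooth(shift(scale(x)))
par_out = parallel(modules, x)    # mean of scale(x), shift(x), smooth(x)

print("sequential:", seq_out)
print("parallel:  ", par_out)
```

In the serial version, whatever one module does to its input (including any distortion of a noisy feature) is passed on to every later module. In the parallel version, each module works from the original input, so one module's output never becomes another module's input.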