Abstract: Several attacks have been proposed against autonomous vehicles and their subsystems that are powered by machine learning (ML). Road sign recognition models are especially heavily tested under various adversarial ML attack settings, and they have proven to be vulnerable. Despite the increasing research on adversarial ML attacks against road sign recognition models, there is little to no focus on defending against these attacks. In this paper, we propose ViLAS (Vision-Language Model for Adversarial Traffic Sign Detection), the first defense method specifically designed for autonomous vehicles to detect adversarial ML attacks targeting road sign recognition models. The proposed defense method is based on a custom, fast, lightweight, and scalable vision-language model (VLM) and is compatible with any existing traffic sign recognition system. Thanks to the orthogonal information coming from the class label text data through the language model, ViLAS leverages image context in addition to visual data for highly effective attack detection performance. In our extensive experiments, we show that our method consistently detects various attacks against different target models with high true positive rates while satisfying very low false positive rates. When tested against four state-of-the-art attacks targeting four popular action recognition models, our proposed detector achieves an average AUC of 0.94. This is a 25.3% improvement over a state-of-the-art defense method proposed for generic image attack detection, which attains an average AUC of 0.75. We also show that our custom VLM is more suitable for an autonomous vehicle than the popular off-the-shelf VLM, CLIP, in terms of speed (4.4 vs. 9.3 milliseconds), space complexity (0.36 vs. 1.6 GB), and performance (0.94 vs. 0.43 average AUC).
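
As a quick check of the comparisons quoted in the abstract, and assuming the 25.3% figure is a relative improvement in average AUC (the abstract does not say so explicitly), the arithmetic works out as:

\[ \frac{0.94 - 0.75}{0.75} \approx 0.253, \qquad \frac{9.3\ \text{ms}}{4.4\ \text{ms}} \approx 2.1, \qquad \frac{1.6\ \text{GB}}{0.36\ \text{GB}} \approx 4.4 \]

On that reading, the custom VLM is roughly 2.1 times faster and 4.4 times smaller than CLIP.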

What problem does this paper attempt to address?

This paper proposes a purpose-built defense against adversarial machine learning attacks on the traffic sign recognition models used in autonomous vehicles. Traffic sign recognition models have been shown to be vulnerable under various adversarial attack settings, yet there is relatively little research on defense mechanisms against these attacks. The paper therefore proposes what it describes as the first method specifically designed for autonomous vehicles to detect adversarial attacks, named ViLAS (Vision-Language Model for Adversarial Traffic Sign Detection).

### Background and Problem Description of the Paper

1. **Adversarial Machine Learning Attacks**:
   - Adversarial machine learning attacks add tiny perturbations to the input data, causing machine learning models to make incorrect predictions. These attacks pose a threat to the safety of autonomous vehicles.
   - The traffic sign recognition model is one of the key modules in autonomous vehicles and is vulnerable to adversarial attacks.
2. **Insufficiencies of Existing Research**:
   - Although there is a lot of research on adversarial attacks, relatively little research has been done on defense mechanisms for traffic sign recognition models.
   - Existing defense methods usually rely on adversarial training or image denoising, but these methods have limited effectiveness when facing new attacks.

### Proposed Solution

1. **ViLAS System**:
   - ViLAS is a defense method based on a vision-language model (VLM) and can be integrated with existing traffic sign recognition models.
   - The method uses the orthogonal information provided by the class label text, combining image context with visual data to improve attack detection performance.
2. **Technical Details**:
   - **Threat Model**: Suppose there is an image classifier \( G(X) \) used specifically for recognizing traffic signs, and an attacker can deceive this classifier by generating an adversarial version \( X_{\text{adv}} \) of an input \( X \).
   - **Detection Process**:
     - Use a custom VLM, which includes an image encoder \( E_I \) and a text encoder \( E_T \).
     - Calculate the cosine similarity \( S \) between the image embedding vector \( I_X \) and the label embedding vector \( T \).
     - Apply the softmax function to \( S \) to obtain the probability score \( p_S \).
     - Calculate the detection score \( \alpha \) as the average of the forward and reverse KL divergences between \( p_S \) and the target classifier's output distribution \( p_G \):
       \[ \alpha = \frac{1}{2}\left[ D_{\text{KL}}(p_S \,\|\, p_G) + D_{\text{KL}}(p_G \,\|\, p_S) \right] \]
     - Decide whether the input is an adversarial sample by comparing \( \alpha \) with a threshold \( h \).
3. **Experimental Results**:
   - Across the reported experiments, ViLAS detects different types of adversarial attacks with an average AUC of 0.94, outperforming an existing defense method (Denoise, with an average AUC of 0.75).
   - ViLAS also compares favorably with an existing pre-trained VLM (CLIP) in speed (4.4 vs. 9.3 milliseconds), model size (0.36 vs. 1.6 GB), and detection performance (0.94 vs. 0.43 average AUC).

### Conclusion

The ViLAS system proposed in this paper provides an efficient and accurate detection mechanism for adversarial attacks on traffic sign recognition models in autonomous vehicles. By combining visual and language information, ViLAS can detect various types of adversarial attacks, thereby improving the safety of autonomous driving systems.
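
The detection process listed under Technical Details can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the embeddings are passed in as plain arrays rather than produced by the encoders \( E_I \) and \( E_T \), no softmax temperature is applied, the `eps` clipping is added for numerical stability, an input is flagged when \( \alpha \) exceeds \( h \), and the toy vectors and the threshold value 0.8 are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cosine_similarities(image_emb, label_embs):
    # Cosine similarity between one image embedding and each label embedding.
    image_emb = image_emb / np.linalg.norm(image_emb)
    label_embs = label_embs / np.linalg.norm(label_embs, axis=1, keepdims=True)
    return label_embs @ image_emb

def kl_divergence(p, q, eps=1e-12):
    # D_KL(p || q); eps clipping is an assumption added to avoid log(0).
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def detection_score(image_emb, label_embs, p_g):
    # alpha = 0.5 * [D_KL(p_S || p_G) + D_KL(p_G || p_S)]
    p_s = softmax(cosine_similarities(image_emb, label_embs))
    return 0.5 * (kl_divergence(p_s, p_g) + kl_divergence(p_g, p_s))

def is_adversarial(alpha, h):
    # Assumption: a score above the threshold h flags the input.
    return alpha > h

# Toy example with three classes (all values hypothetical).
label_embs = np.eye(3)                    # stand-in for text embeddings T
image_emb = np.array([1.0, 0.1, 0.0])     # stand-in for image embedding I_X
p_g_clean = np.array([0.90, 0.05, 0.05])  # classifier agrees with the VLM
p_g_adv = np.array([0.05, 0.05, 0.90])    # classifier was pushed to class 2

alpha_clean = detection_score(image_emb, label_embs, p_g_clean)
alpha_adv = detection_score(image_emb, label_embs, p_g_adv)
h = 0.8                                   # hypothetical threshold
print(is_adversarial(alpha_clean, h), is_adversarial(alpha_adv, h))
```

In the toy example the score is larger when the classifier's distribution disagrees with the VLM's, which is the signal the detector relies on. In practice \( h \) would be chosen on held-out data to meet a target false positive rate.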

Fast and Lightweight Vision-Language Model for Adversarial Traffic Sign Detection
