Abstract: An automatic speech recognition (ASR) system based on a deep neural network is vulnerable to attack by an adversarial example, especially if the command-dependent ASR fails. A defense method against adversarial examples is proposed to improve the robustness and security of the ASR system. We propose an algorithm of devastation and detection on adversarial examples that can attack current advanced ASR systems. We choose an advanced text- and command-dependent ASR system as our target, generating adversarial examples by an optimization-based attack on text-dependent ASR and the GA-based algorithm on command-dependent ASR. The method is based on input transformation of adversarial examples. Different random intensities and kinds of noise are added to adversarial examples to devastate the perturbation previously added to normal examples. Experimental results show that the method performs well. For the devastation of examples, the original speech similarity after adding noise can reach 99.68%, the similarity of adversarial examples can reach zero, and the detection rate of adversarial examples can reach 94%.

What problem does this paper attempt to address?

The paper attempts to address the security and robustness issues of Automatic Speech Recognition (ASR) systems when faced with adversarial sample attacks. Specifically, ASR systems based on deep neural networks are susceptible to adversarial samples, especially in command-dependent ASR systems. To enhance the security and robustness of ASR systems, the paper proposes a method to disrupt and detect adversarial samples by adding random noise.

### Main Research Content

1. **Generation of Adversarial Samples**:
   - The paper uses two methods to generate adversarial samples:
     - **Optimization Method (OPT)**: Based on gradient optimization, suitable for text-dependent ASR systems.
     - **Genetic Algorithm (GA)**: Suitable for command-dependent ASR systems.
2. **Disruption of Adversarial Samples**:
   - A method is proposed to disrupt adversarial samples by adding random noise to the input signal. The specific steps include:
     - Generate adversarial samples \( x^* = x + \delta^* \).
     - Add Gaussian noise to the adversarial samples \( \hat{x}^* = x^* + \hat{\delta} \).
     - By adjusting the intensity of the noise, the perturbation of the adversarial samples loses specificity, thereby losing its attack capability.
3. **Detection of Adversarial Samples**:
   - Based on the disruption method, a strategy for detecting adversarial samples is proposed. The specific steps include:
     - Add random noise \( \hat{\delta} \) to the input sample \( x \).
     - Compare the change rate (CR) of the recognition results before and after adding noise.
     - If the change rate exceeds a certain threshold \( K \), the sample is determined to be an adversarial sample.

### Experimental Results

1. **Disruption Effect**:
   - Experimental results show that adding random noise of appropriate intensity can significantly reduce the similarity of adversarial samples while having a minimal impact on normal samples. For example, in the TIMIT and LibriSpeech databases, when the noise intensity is 50, the similarity of adversarial samples drops to 0%, while the similarity of normal samples remains high.
2. **Detection Effect**:
   - In command-dependent ASR systems, by adjusting the noise intensity, the attack success rate of adversarial samples can be significantly reduced without affecting the recognition accuracy of normal samples. Experimental results show that when the noise intensity is above 100, the average attack success rate (ASR avg) of adversarial samples drops below 10%.

### Conclusion

The method proposed in the paper effectively disrupts and detects adversarial samples by adding random noise, enhancing the security and robustness of ASR systems. Experimental results indicate that this method performs well in different types of ASR systems.
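The disruption and detection steps summarized above can be illustrated with a minimal Python sketch. It is not the paper's implementation, and several details are assumptions made only for illustration: the audio is taken to be 16-bit PCM, the noise "intensity" is interpreted as the standard deviation of the Gaussian noise in raw sample units, the change rate (CR) is defined as one minus a character-level similarity ratio between the two transcripts, and `transcribe` is a hypothetical callable standing in for whatever ASR system is being protected. The function names and the default threshold of 0.5 are likewise placeholders.

```python
import difflib
from typing import Callable

import numpy as np


def add_gaussian_noise(x: np.ndarray, intensity: float,
                       rng: np.random.Generator) -> np.ndarray:
    """Return x + delta_hat with delta_hat ~ N(0, intensity^2).

    Assumption: x holds 16-bit PCM samples and `intensity` is the noise
    standard deviation in the same units.
    """
    noise = rng.normal(0.0, intensity, size=x.shape)
    noisy = x.astype(np.float64) + noise
    # Round and clip so the result is still valid int16 audio.
    return np.clip(np.round(noisy), -32768, 32767).astype(np.int16)


def change_rate(before: str, after: str) -> float:
    """Assumed CR: 1 minus the character-level similarity of two transcripts."""
    return 1.0 - difflib.SequenceMatcher(None, before, after).ratio()


def is_adversarial(x: np.ndarray,
                   transcribe: Callable[[np.ndarray], str],
                   intensity: float = 50.0,
                   threshold: float = 0.5,
                   seed: int = 0) -> bool:
    """Flag x as adversarial if added noise changes its transcript too much."""
    rng = np.random.default_rng(seed)
    before = transcribe(x)
    after = transcribe(add_gaussian_noise(x, intensity, rng))
    return change_rate(before, after) > threshold


if __name__ == "__main__":
    # Toy stand-in for an ASR system: the "attack" transcript only survives
    # when the input is bit-for-bit identical to the crafted sample.
    x_adv = np.zeros(16000, dtype=np.int16)

    def toy_transcribe(signal: np.ndarray) -> str:
        return "open the door" if np.array_equal(signal, x_adv) else "zzz"

    print(is_adversarial(x_adv, toy_transcribe))
```

In practice the threshold \( K \) and the noise intensity would have to be tuned on held-out normal and adversarial samples, since too little noise leaves the perturbation intact and too much degrades recognition of normal speech.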

Adversarial Example Devastation and Detection on Speech Recognition System by Adding Random Noise
