Abstract: Prior research shows that existing automatic speech recognition (ASR) systems are vulnerable to adversarial examples. Most existing adversarial attacks against ASR systems are either white-box or gray-box, which limits their practical use in the real world. Some black-box attacks also assume knowledge of the output probability vectors to infer the output distribution. Other black-box attacks rely on inefficient, heavyweight processes, i.e., training auxiliary models or estimating gradients. Moreover, they require input-specific, manual hyperparameter tuning to improve the attack success rate against a specific ASR system. Despite such heavyweight tuning, nearly half or even more than half of the generated adversarial examples are perceptible to humans. This paper presents KENKU, an efficient and stealthy black-box adversarial attack framework against ASR systems that supports hidden voice command and integrated command attacks. It optimizes a novel acoustic feature loss and a perturbation loss, based on Mel-frequency Cepstral Coefficients (MFCC). Both loss values can be calculated locally, without training auxiliary models or estimating gradients, which makes the attack efficient. Furthermore, we introduce a hyperparameter into the optimization that automatically balances attack effectiveness and imperceptibility; KENKU uses binary search to find its optimal value. We evaluated our prototype on eight real-world systems (covering five digital and three physical attacks) and compared KENKU with five state-of-the-art works. Results show that KENKU outperforms existing works in attack performance.
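
The binary search over the balancing hyperparameter can be illustrated with a minimal sketch. This is not KENKU's implementation: the function name `search_balance`, the search range, and the monotonicity assumption (the attack succeeds for small weights on the perturbation loss and fails once the weight grows too large) are all hypothetical. In a real attack, the oracle would presumably run the local optimization for a given weight and then query the target ASR; here a toy threshold stands in for that step.

```python
from typing import Callable


def search_balance(
    attack_succeeds: Callable[[float], bool],
    lo: float = 0.0,
    hi: float = 1.0,
    steps: int = 20,
) -> float:
    """Binary-search the largest weight in [lo, hi] for which the attack succeeds.

    Assumption (not from the paper): success is monotone in the weight, i.e. it
    holds for small weights and fails for large ones. If the attack never
    succeeds, the initial `lo` is returned unchanged.
    """
    best = lo
    for _ in range(steps):
        mid = (lo + hi) / 2
        if attack_succeeds(mid):
            best = mid  # still effective: try a larger weight (more imperceptible)
            lo = mid
        else:
            hi = mid    # attack failed: reduce the weight
    return best


# Toy oracle standing in for "optimize locally, then query the ASR":
# the attack is assumed to succeed only while the weight is at most 0.37.
found = search_balance(lambda w: w <= 0.37)
```

Each step halves the search interval, so 20 steps narrow a unit interval to roughly one millionth of its width while issuing only 20 oracle calls, which is why a search of this kind suits a setting where every query to the target system is costly.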

Evaluating and Enhancing the Robustness of Retrieval-Based Dialogue Systems with Adversarial Examples

Attack As Defense: Characterizing Adversarial Examples Using Robustness

Adversarial Learning for Neural Dialogue Generation

Extracting Robust Models with Uncertain Examples

White-Box Multi-Objective Adversarial Attack on Dialogue Generation

On Evaluating Adversarial Robustness of Large Vision-Language Models

A Prompting-based Approach for Adversarial Example Generation and Robustness Enhancement

SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models

Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks

Evaluating and Enhancing the Robustness of Neural Network-based Dependency Parsing Models with Adversarial Examples

Probing the Robustness of Trained Metrics for Conversational Dialogue Systems

RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models

Adversarial Attacks and Defense for Conversation Entailment Task

Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation

Enhancing Model Robustness By Incorporating Adversarial Knowledge Into Semantic Representation

Improving Robustness of Task Oriented Dialog Systems

Assessing Adversarial Robustness of Large Language Models: An Empirical Study

KENKU: Towards Efficient and Stealthy Black-box Adversarial Attacks against ASR Systems

Robust Testing of AI Language Model Resiliency with Novel Adversarial Prompts

Evaluating and Safeguarding the Adversarial Robustness of Retrieval-Based In-Context Learning

Adversarial Examples Attack and Countermeasure for Speech Recognition System: A Survey