Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models

Vyas Raina, Mark Gales
2024-10-12
Abstract: Speech enabled foundation models, either in the form of flexible speech recognition based systems or audio-prompted large language models (LLMs), are becoming increasingly popular. One of the interesting aspects of these models is their ability to perform tasks other than automatic speech recognition (ASR) using an appropriate prompt. For example, the OpenAI Whisper model can perform both speech transcription and speech translation. With the development of audio-prompted LLMs there is the potential for even greater control options. In this work we demonstrate that with this greater flexibility the systems can be susceptible to model-control adversarial attacks. Without any access to the model prompt it is possible to modify the behaviour of the system by appropriately changing the audio input. To illustrate this risk, we demonstrate that it is possible to prepend a short universal adversarial acoustic segment to any input speech signal to override the prompt setting of an ASR foundation model. Specifically, we successfully use a universal adversarial acoustic segment to control Whisper to always perform speech translation, despite being set to perform speech transcription. Overall, this work demonstrates a new form of adversarial attack on multi-tasking speech enabled foundation models that needs to be considered prior to the deployment of this form of model.
Sound, Computation and Language, Audio and Speech Processing
What problem does this paper attempt to address?
The paper addresses model-control adversarial attacks on multi-task automatic speech recognition (ASR) models. Specifically, it demonstrates that, even without access to the model prompt, an attacker can make a multi-task ASR model perform a different task than the one it was set to perform by prepending a short universal adversarial acoustic segment to the input audio signal. For example, the paper shows how this method can force OpenAI's Whisper model, set to perform speech transcription, to perform speech translation instead.

### Main Research Content

1. **Background and Motivation**:
   - Multi-task ASR models (such as Whisper) can perform several speech processing tasks, such as speech transcription and speech translation.
   - This flexibility introduces a new security vulnerability, the model-control adversarial attack, in which an attacker changes the model's task setting by modifying the input audio.
2. **Threat Model**:
   - The attacker cannot directly modify the model's internal structure or prompts, but can achieve their goal by modifying the input audio.
   - The attack operates in the acoustic space, and the adversarial segment must be simple to apply so that it works with real-time speech processing.
3. **Attack Method**:
   - A short universal adversarial acoustic segment is prepended to the input audio, forcing the model to perform the attacker's target task instead of the task it was set to perform.
   - The adversarial segment is optimized using gradient descent to maximize the probability of the model producing the target-task output, while remaining inconspicuous enough not to arouse suspicion.
4. **Experimental Results**:
   - Experiments were conducted on multiple language pairs, including French-English, German-English, Russian-English, and Korean-English, to verify the effectiveness and generalization ability of the attack.
   - The results show that as the intensity of the adversarial segment increases, the performance of the model-control attack gradually approaches the upper limit given by the model's translation mode.
   - The attack's success rate and translation quality follow a binary pattern: the attack either succeeds completely or fails completely, with no intermediate state.

### Conclusion

The paper reveals the vulnerability of multi-task speech foundation models to model-control adversarial attacks and demonstrates that prepending a short universal adversarial acoustic segment can change the model's task setting. The success of such attacks is clearly binary, which underlines the need for enhanced security measures when deploying flexible ASR systems. Future research should focus on developing robust defense mechanisms against model-control adversarial attacks.
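
The attack method summarized above (one segment, prepended to every utterance, optimized by gradient steps under an inconspicuousness constraint) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the model is replaced by a toy linear surrogate (`target_prob`) rather than Whisper, the constraint is modeled as a per-sample amplitude clip `epsilon`, and `SEG_LEN`, `WINDOW`, the step size, and the random "speech" are hypothetical.

```python
import numpy as np

SEG_LEN = 64   # hypothetical length of the universal segment, in samples
WINDOW = 96    # the toy surrogate only looks at the first WINDOW samples

def prepend_segment(segment, speech):
    """Attacked audio: the universal segment followed by the original speech."""
    return np.concatenate([segment, speech])

def project(segment, epsilon):
    """Assumed constraint: clip every sample to [-epsilon, epsilon]."""
    return np.clip(segment, -epsilon, epsilon)

def target_prob(audio, w):
    """Toy stand-in for the model: probability of the attacker's target task.

    This is a logistic score over the first WINDOW samples, NOT Whisper.
    """
    z = float(w @ audio[:WINDOW])
    return 1.0 / (1.0 + np.exp(-z))

def train_universal_segment(utterances, w, epsilon, steps=200, lr=0.05):
    """Projected gradient ascent on the mean log-probability of the target task.

    Averaging the gradient over many utterances is what makes the single
    segment universal rather than tuned to one input.
    """
    segment = np.zeros(SEG_LEN)
    for _ in range(steps):
        grad = np.zeros(SEG_LEN)
        for speech in utterances:
            p = target_prob(prepend_segment(segment, speech), w)
            # d/d(segment) of log(sigmoid(z)) for the linear surrogate
            grad += (1.0 - p) * w[:SEG_LEN]
        segment = project(segment + lr * grad / len(utterances), epsilon)
    return segment

rng = np.random.default_rng(0)
w = rng.normal(size=WINDOW)                       # surrogate weights (made up)
train = [rng.normal(scale=0.1, size=rng.integers(200, 400)) for _ in range(8)]
held_out = [rng.normal(scale=0.1, size=300) for _ in range(4)]

segment = train_universal_segment(train, w, epsilon=0.1)
silence = np.zeros(SEG_LEN)                       # baseline: no attack
before = np.mean([target_prob(prepend_segment(silence, s), w) for s in held_out])
after = np.mean([target_prob(prepend_segment(segment, s), w) for s in held_out])
print(f"mean target-task probability: {before:.3f} -> {after:.3f}")
```

The segment is trained on one set of utterances and evaluated on held-out ones, mirroring the universal setting in which the same segment is applied to unseen speech. Against a real model, the surrogate gradient would be replaced by backpropagation through the model's decoder, which this sketch does not attempt.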