Understanding and Benchmarking the Commonality of Adversarial Examples
Ruiwen He,Yushi Cheng,Junning Ze,Xiaoyu Ji,Wenyuan Xu
DOI: https://doi.org/10.1109/sp54263.2024.00111
2024-01-01
Abstract:Speech recognition system converts audio into texts by utilizing deep learning algorithms. Numerous works have demonstrated various adversarial example (AE) attacks, i.e., adding carefully-crafted noises can trick the speech recognition system into outputting completely incorrect texts. This paper aims to reveal the distinctive properties of adversarial audio in terms of phonetics. We believe analyzing the distinctive properties is critical in understanding adversarial attacks on ASR models, as well as guiding the generation and defense of AEs. Thus, we aim to answer three questions: (1) What are the distinctive properties of adversarial audio that are common to diverse attacks? (2) How to quantify these distinctive properties? (3) How can we use these properties to improve the security of ASR models? To answer these questions, we perform a large-scale measurement based on acoustic features and statistical analysis. By measuring a total of 612,000 acoustic-statistical feature vectors for 2,400 audio samples, we obtain four insights on the distinctive properties, i.e., filling energy gap, speech-like morphology, disordered signal, and abnormal linguistic pattern. Based on these properties, we design a naturalness score to assess the stealthiness of attacks and propose an adversarial example detector with an average accuracy of 91.1%.