Securing Voice Authentication Applications Against Targeted Data Poisoning

Alireza Mohammadi, Keshav Sood, Asef Nazari, Dhananjay Thiruvady
2024-10-01
Abstract: Deep neural network-based voice authentication systems are promising biometric verification techniques that uniquely identify biological characteristics to verify a user. However, they are particularly susceptible to targeted data poisoning attacks, where attackers replace legitimate users' utterances with their own. We propose an enhanced framework using real-world datasets considering realistic attack scenarios. The results show that the proposed approach is robust, providing accurate authentications even when only a small fraction (5% of the dataset) is poisoned.
Cryptography and Security
What problem does this paper attempt to address?
This paper addresses the security and robustness of deep neural network (DNN)-based voice authentication systems against targeted data-poisoning attacks. Specifically:

1. **Privacy and security risks**: Voice authentication systems rely on users' unique voiceprint features for identity verification, but their training data is vulnerable to tampering. If an attacker replaces a legitimate user's voice samples with their own, the system may learn the attacker's features instead, leading to unauthorized access.
2. **Limitations of existing defense mechanisms**:
   - Existing defenses fall short in both efficiency and accuracy.
   - They do not model real-world attack scenarios well, so they are less effective in actual deployment.
   - The case where multiple attackers simultaneously impersonate legitimate users and manipulate their profiles is not yet well understood.
3. **The impact of small-proportion data poisoning**: Previous research has not fully explored covert poisoning attacks that tamper with only a small share (for example, 0.1% to 10%) of user data.

To address these problems, the authors propose an enhanced framework that combines a convolutional neural network (CNN) with a K-nearest neighbor (KNN) model to improve the defense of voice authentication systems against targeted data-poisoning attacks. The framework is evaluated on real-world datasets and is reported to be robust and effective against small-proportion data-poisoning attacks.

### Formula representation

The paper states the following assumption:

- \( p_n(z_n|X_n) \) is the probability distribution of normal feature vectors \( z_n \) given \( X_n \), where \( X_n \) is a subset of input audio from normal accounts.
- \( p_a(z_a|X_a) \) is the probability distribution of attacked feature vectors \( z_a \) given \( X_a \), where \( X_a \) is a subset of input audio from attacked accounts.

The assumed expression is:

\[ p_n(z_n|X_n) = p_a(z_a|X_a), \quad \text{but} \quad p(X_n) \neq p(X_a) \]

In words, normal and attacked feature vectors share the same conditional distribution given their inputs, while the distributions of the inputs themselves differ. This reflects that an attacker can shift the system's decision boundary by replacing a small amount of data, thereby gaining unauthorized access.

### Summary

This paper aims to strengthen voice authentication systems against targeted data-poisoning attacks by proposing a new framework, with particular attention to cases where only a small proportion of the data is tampered with. The experimental results reported by the authors indicate that the framework outperforms existing methods in accuracy and robustness and can handle realistic attack scenarios.
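
As a rough illustration of the attack and of a KNN-style check on feature vectors, the sketch below simulates targeted poisoning on synthetic data. It is not the authors' implementation. The CNN is replaced by hand-made 2-D vectors standing in for learned embeddings, and the names `make_cluster`, `poison`, and `knn_flag`, the cluster centers, the 5% fraction, and `k=5` are all assumptions chosen for the example. The check flags any enrolled sample whose nearest neighbors mostly carry a different account label.

```python
import math
import random
from collections import Counter


def make_cluster(center, n, rng, std=0.5):
    """Draw n synthetic 2-D vectors around center (stand-in for CNN features)."""
    return [(rng.gauss(center[0], std), rng.gauss(center[1], std)) for _ in range(n)]


def poison(victim_vecs, attacker_center, fraction, rng):
    """Replace a fraction of the victim's vectors with attacker-like vectors.

    Returns the poisoned list and the set of replaced indices.
    """
    n_poison = max(1, round(fraction * len(victim_vecs)))
    replaced = set(rng.sample(range(len(victim_vecs)), n_poison))
    fakes = iter(make_cluster(attacker_center, n_poison, rng))
    poisoned = [next(fakes) if i in replaced else v for i, v in enumerate(victim_vecs)]
    return poisoned, replaced


def knn_flag(vectors, labels, k=5):
    """Flag samples whose k nearest neighbours mostly carry a different label."""
    flagged = []
    for i, v in enumerate(vectors):
        dists = sorted(
            (math.dist(v, w), labels[j]) for j, w in enumerate(vectors) if j != i
        )
        majority = Counter(lbl for _, lbl in dists[:k]).most_common(1)[0][0]
        if majority != labels[i]:
            flagged.append(i)
    return flagged


rng = random.Random(0)
victim = make_cluster((0.0, 0.0), 40, rng)       # legitimate user's samples
attacker = make_cluster((10.0, 10.0), 40, rng)   # attacker's own account

# Poison 5% of the victim's enrolled samples; labels still say "victim".
victim_poisoned, replaced = poison(victim, (10.0, 10.0), 0.05, rng)

vectors = victim_poisoned + attacker
labels = ["victim"] * len(victim_poisoned) + ["attacker"] * len(attacker)
flagged = knn_flag(vectors, labels, k=5)
print("replaced:", sorted(replaced), "flagged:", flagged)
```

The clusters here are deliberately far apart, so the replaced samples are easy to spot. Real voice embeddings overlap far more, which is where the choice of feature extractor and of `k` would matter.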