Impact of Channel Variation on One-Class Learning for Spoof Detection

Rohit Arora, Anmol Arora, Rohit Singh Rathore
DOI: https://doi.org/10.48550/arXiv.2109.14900
2022-06-27
Abstract: Margin-based losses, especially one-class classification loss, have improved the generalization capabilities of countermeasure systems (CMs), but their reliability has not been tested on spoofing attacks degraded by channel variation. Our experiments aim to tackle this in two ways: first, by investigating the impact of various codec simulations and their corresponding parameters, namely bit-rate, discontinuous transmission (DTX), and loss, on the performance of the one-class classification-based CM system; second, by testing the efficacy of various settings of margin-based losses for training and evaluating our CM system on codec-simulated data. Multi-conditional training (MCT), along with various data-feeding and custom mini-batching strategies, was also explored to handle the added variability in the new data setting and to find an optimal setting for the above experiments. Our experimental results reveal that a strict restraint on the embedding space degrades the performance of the one-class classification model. MCT yields a relative performance improvement of 35.55%, and custom mini-batching captures more generalized features for the new data setting. Varying the codec parameters, in turn, had a significant impact on the performance of the countermeasure system.
Machine Learning, Audio and Speech Processing
### What problems does this paper attempt to solve?

This paper examines how channel variation affects speech spoof detection systems built on one-class learning. Specifically, it studies how the performance of countermeasure systems (CMs) changes under different codec simulation conditions. The main problems it addresses are:

1. **The impact of channel variation on CM performance**:
   - The researchers simulate codecs with different parameters (bit rate, discontinuous transmission (DTX), and packet loss rate) and evaluate the CM system on the resulting data.
   - They vary each parameter across several settings to identify which ones significantly affect system performance.
2. **The effects of multi-condition training (MCT) and custom mini-batching strategies**:
   - The researchers explore MCT together with different data-feeding and custom mini-batching strategies to handle the variability introduced by channel variation and to find the best training setup.
   - For example, they compare random mini-batches against customized ones (each batch containing equal numbers of spoofed and real samples, or each batch containing samples from the same speaker or the same codec simulation) in terms of the model's generalization ability.
3. **The effectiveness of different loss functions**:
   - The researchers compare Softmax with the margin-based AM-Softmax and OC-Softmax losses on codec-simulated data.
   - They pay particular attention to how strictly constraining the embedding space of real (bona fide) samples affects model performance, and they test several settings of each loss.
4. **The generalization ability of the model in real-world scenarios**:
   - The researchers use the ASVspoof 2021 evaluation set to test how well the model generalizes to new conditions and to check its reliability in real-world applications.

Through these experiments, the researchers aim to find an anti-spoofing method that maintains good performance under channel variation, thereby improving the security of Automatic Speaker Verification (ASV) systems.

### Formula summary

The formulas involved in the paper include:

- **Softmax Loss**:
  \[ L_S=\frac{1}{N}\sum_{i = 1}^{N}\log\left(1 + e^{(\mathbf{w}_{1 - y_i}-\mathbf{w}_{y_i})^T\mathbf{x}_i}\right) \]
  where \(N\) is the number of samples in a mini-batch, \(\mathbf{x}_i\in\mathbb{R}^D\) and \(y_i\in\{0, 1\}\) are the embedding and label respectively, and \(\mathbf{w}_0,\mathbf{w}_1\in\mathbb{R}^D\) are the weight vectors of the two classes.
- **AM-Softmax Loss**:
  \[ L_{AMS}=\frac{1}{N}\sum_{i = 1}^{N}\log\left(1 + e^{\alpha\left(m - (\hat{\mathbf{w}}_{y_i}-\hat{\mathbf{w}}_{1 - y_i})^T\hat{\mathbf{x}}_i\right)}\right) \]
  where \(\alpha\) is a scale factor, \(m\) is the margin, and \(\hat{\mathbf{w}}\), \(\hat{\mathbf{x}}_i\) are the L2-normalized versions of \(\mathbf{w}\), \(\mathbf{x}_i\).
- **OC-Softmax Loss**:
  \[ L_{OCS}=\frac{1}{N}\sum_{i = 1}^{N}\log\left(1 + e^{\alpha\left(m_{y_i}-\hat{\mathbf{w}}_0^T\hat{\mathbf{x}}_i\right)(-1)^{y_i}}\right) \]
  where only the target-class direction \(\hat{\mathbf{w}}_0\) is used, and \(m_0, m_1\) are the margins applied to the two classes.
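To make the three losses concrete, here is a minimal pure-Python sketch that evaluates each of them on a mini-batch. It is an illustration, not the authors' implementation: the function names are hypothetical, the defaults for `alpha`, `m`, `m0`, and `m1` are placeholder values, and the code assumes the convention that label 0 marks bona fide speech and label 1 marks spoofed speech. The OC-Softmax term follows the standard form with the sign factor \((-1)^{y_i}\).

```python
import math


def _normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def _dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def _softplus(z):
    """Numerically stable log(1 + e^z)."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def softmax_loss(xs, ys, w0, w1):
    """Binary softmax loss: mean of log(1 + exp((w_{1-y} - w_y)^T x))."""
    ws = (w0, w1)
    total = 0.0
    for x, y in zip(xs, ys):
        total += _softplus(_dot(ws[1 - y], x) - _dot(ws[y], x))
    return total / len(xs)


def am_softmax_loss(xs, ys, w0, w1, alpha=20.0, m=0.9):
    """AM-Softmax: one margin m on the cosine gap between the two classes."""
    ws = (_normalize(w0), _normalize(w1))
    total = 0.0
    for x, y in zip(xs, ys):
        xh = _normalize(x)
        cos_gap = _dot(ws[y], xh) - _dot(ws[1 - y], xh)
        total += _softplus(alpha * (m - cos_gap))
    return total / len(xs)


def oc_softmax_loss(xs, ys, w0, alpha=20.0, m0=0.9, m1=0.2):
    """OC-Softmax: a single direction w0 with class-specific margins.

    Assumed convention: y = 0 is bona fide, y = 1 is spoof.
    """
    wh = _normalize(w0)
    margins = (m0, m1)
    total = 0.0
    for x, y in zip(xs, ys):
        score = _dot(wh, _normalize(x))      # cosine similarity to w0
        sign = -1.0 if y == 1 else 1.0       # the (-1)^{y_i} factor
        total += _softplus(alpha * (margins[y] - score) * sign)
    return total / len(xs)


# Toy batch: one bona fide embedding aligned with w0, one spoof opposite to it.
xs = [[1.0, 0.0], [-1.0, 0.0]]
ys = [0, 1]
w0, w1 = [1.0, 0.0], [-1.0, 0.0]
print(softmax_loss(xs, ys, w0, w1))
print(am_softmax_loss(xs, ys, w0, w1))
print(oc_softmax_loss(xs, ys, w0))
```

In this sketch, raising `m0` toward 1 forces bona fide embeddings into a narrower cone around `w0`, which is the kind of strict constraint on the embedding space that the summary above says degraded performance.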