Two Methods for Spoofing-Aware Speaker Verification: Multi-Layer Perceptron Score Fusion Model and Integrated Embedding Projector

Jungwoo Heo, Ju-ho Kim, Hyun-seo Shin

DOI: https://doi.org/10.21437/Interspeech.2022-602

2022-06-28

Abstract: The use of deep neural networks (DNN) has dramatically elevated the performance of automatic speaker verification (ASV) over the last decade. However, ASV systems can be easily neutralized by spoofing attacks. Therefore, the Spoofing-Aware Speaker Verification (SASV) challenge was designed and held to promote the development of systems that can perform ASV under spoofing attacks by integrating ASV and spoofing countermeasure (CM) systems. In this paper, we propose two back-end systems: the multi-layer perceptron score fusion model (MSFM) and the integrated embedding projector (IEP). The MSFM, a score fusion back-end system, derives the SASV score from ASV and CM scores and embeddings. The IEP, on the other hand, combines ASV and CM embeddings into an SASV embedding and calculates the final SASV score based on cosine similarity. We effectively integrated ASV and CM systems through the proposed MSFM and IEP and achieved SASV equal error rates of 0.56% and 1.32% on the official evaluation trials of the SASV 2022 challenge.
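The equal error rate quoted in the abstract is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below is only an illustration of that metric, not the challenge's official scoring tool: it assumes that target trials form the positive class and that zero-effort non-target and spoofed trials are pooled into one negative class, and it approximates the EER by sweeping each observed score as a threshold. The function name and the toy scores are hypothetical.

```python
def equal_error_rate(target_scores, negative_scores):
    """Approximate the EER by trying every observed score as a threshold."""
    thresholds = sorted(set(target_scores) | set(negative_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection: target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: non-target or spoof trials scored at or above it.
        far = sum(s >= t for s in negative_scores) / len(negative_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (made up for illustration): higher means "accept".
target = [0.9, 0.8, 0.7, 0.4]
nontarget_and_spoof = [0.1, 0.2, 0.3, 0.5]
toy_eer = equal_error_rate(target, nontarget_and_spoof)
print(f"EER = {toy_eer:.2%}")  # prints "EER = 25.00%"
```

With four trials per class, one rejected target and one accepted negative at the threshold 0.5 give equal error rates of 25%. Real evaluations use far more trials and usually interpolate between thresholds.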

Audio and Speech Processing, Sound

What problem does this paper attempt to address?

This paper addresses the vulnerability of Automatic Speaker Verification (ASV) systems to spoofing attacks. Although methods based on Deep Neural Networks (DNN) have significantly improved ASV performance over the past decade, these systems can still be defeated by spoofed speech. The paper therefore proposes two back-end systems, the Multi-Layer Perceptron Score Fusion Model (MSFM) and the Integrated Embedding Projector (IEP), which combine ASV and Countermeasure (CM) systems so that speaker verification remains reliable under spoofing attacks. Both systems aim to improve performance on the Spoofing-Aware Speaker Verification (SASV) task through effective fusion strategies, without modifying or retraining the existing ASV and CM systems. MSFM generates the final SASV score from the scores and embeddings of the ASV and CM systems, while IEP combines the ASV and CM embeddings into SASV embeddings and calculates the final SASV score based on cosine similarity. With these two methods, the authors reached Equal Error Rates (EER) of 0.56% and 1.32% in the SASV 2022 challenge, demonstrating the effectiveness of the proposed methods.
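The two fusion strategies described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' architecture: the layer sizes, the embedding dimensions, the random untrained weights, and every function name are hypothetical, and the embeddings are random stand-ins for the outputs of frozen ASV and CM systems. The first function follows the score-fusion idea (an MLP maps scores and embeddings to one SASV score); the second follows the embedding-fusion idea (concatenated embeddings are projected to one SASV embedding, then enrollment and test are compared by cosine similarity).

```python
import math
import random


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def linear(weights, bias, x):
    """Apply one dense layer; weights holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def random_layer(rng, n_out, n_in):
    # Untrained placeholder weights; a real back-end would learn these.
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def score_fusion(layers, asv_score, cm_score, asv_emb, cm_emb):
    """Score-fusion style: an MLP turns scores and embeddings into one SASV score."""
    h = [asv_score, cm_score] + asv_emb + cm_emb
    for weights, bias in layers[:-1]:
        h = relu(linear(weights, bias, h))
    weights, bias = layers[-1]
    return linear(weights, bias, h)[0]


def integrated_embedding(layer, asv_emb, cm_emb):
    """Embedding-fusion style: project concatenated embeddings to one SASV embedding."""
    weights, bias = layer
    return linear(weights, bias, asv_emb + cm_emb)


rng = random.Random(0)
ASV_DIM, CM_DIM, SASV_DIM = 4, 3, 5  # hypothetical toy sizes


def random_embedding(dim):
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


# Stand-ins for embeddings from frozen, pre-trained ASV and CM systems.
enroll_asv, enroll_cm = random_embedding(ASV_DIM), random_embedding(CM_DIM)
test_asv, test_cm = random_embedding(ASV_DIM), random_embedding(CM_DIM)

# Score-fusion path: two input scores plus the test-side embeddings.
mlp = [random_layer(rng, 8, 2 + ASV_DIM + CM_DIM), random_layer(rng, 1, 8)]
asv_score = cosine_similarity(enroll_asv, test_asv)
cm_score = 0.7  # hypothetical CM output for the test utterance
fused_score = score_fusion(mlp, asv_score, cm_score, test_asv, test_cm)

# Embedding-fusion path: one projector shared by enrollment and test sides.
projector = random_layer(rng, SASV_DIM, ASV_DIM + CM_DIM)
enroll_sasv = integrated_embedding(projector, enroll_asv, enroll_cm)
test_sasv = integrated_embedding(projector, test_asv, test_cm)
sasv_score = cosine_similarity(enroll_sasv, test_sasv)
```

Because the weights here are random, the resulting scores carry no meaning; the point is only the data flow. In both paths the ASV and CM systems stay fixed and only the small back-end would be trained.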

Spoofing-Aware Speaker Verification by Multi-Level Fusion

Siamese Network with Wav2vec Feature for Spoofing Speech Detection

Tackling Spoofing-Aware Speaker Verification with Multi-Model Fusion

Spoofing-Robust Speaker Verification Using Parallel Embedding Fusion: BTU Speech Group's Approach for ASVspoof5 Challenge

Towards single integrated spoofing-aware speaker verification embeddings

A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification

Backend Ensemble for Speaker Verification and Spoofing Countermeasure

Integrated Replay Spoofing-Aware Text-Independent Speaker Verification

SA-SASV: An End-to-End Spoof-Aggregated Spoofing-Aware Speaker Verification System

Multi-task learning of deep neural networks for joint automatic speaker verification and spoofing detection

SASV Based on Pre-trained ASV System and Integrated Scoring Module

Voice Presentation Attack Detection Using Convolutional Neural Networks

SASV Challenge 2022: A Spoofing Aware Speaker Verification Challenge Evaluation Plan

Generalizing Speaker Verification for Spoof Awareness in the Embedding Space

The Vicomtech Spoofing-Aware Biometric System for the SASV Challenge

SASV 2022: The First Spoofing-Aware Speaker Verification Challenge

Norm-constrained Score-level Ensemble for Spoofing Aware Speaker Verification

Spoofing Speaker Verification System by Adversarial Examples Leveraging the Generalized Speaker Difference

Spoofing-Aware Speaker Verification Robust Against Domain and Channel Mismatches