Replay spoof detection for speaker verification system using magnitude-phase-instantaneous frequency and energy features

K. P. Bharath, M. Rajesh Kumar
DOI: https://doi.org/10.1007/s11042-022-12380-7
IF: 2.577
2022-04-29
Multimedia Tools and Applications
Abstract: Spoofing attack detection is an essential component of automatic speaker verification (ASV) systems. The success of ASV-2015 showed great promise in detecting voice conversion and speech synthesis spoofs. However, replay attack detection has received less attention from researchers, even though replay attacks are the kind most likely to be used by non-professional impersonators. This paper detects replay attacks on the ASV system using the ASVspoof-2017-v2.0 corpus. The work is divided into two parts. The first part shows the significance of Empirical Mode Decomposition (EMD) and the Hilbert Spectrum (HS) for replay attack detection: the instantaneous frequency (IF) and instantaneous energy (IE) are extracted from the frequency components of the speech signal to differentiate the characteristics of genuine and spoofed speech, and are then passed to rectangular filter cepstral coefficients (RFCC) to obtain the feature set used to decide whether a given speech sample is genuine or spoofed. In the second part, a new score-level fusion system is proposed to increase system performance. Along with the proposed stand-alone method, Constant-Q cepstral coefficients (CQCC) and the All-Pole Group Delay Function (APGDF) are used to extract magnitude and phase feature sets, respectively. The proposed stand-alone and score-level fusion methods achieve better performance than other state-of-the-art techniques.
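The abstract does not say how the scores of the three subsystems are combined. As a rough illustration of score-level fusion in general, the sketch below z-normalizes each subsystem's scores and takes a weighted average. The function names, the choice of z-normalization, the equal weights, and the example scores are all assumptions made for illustration; none of them come from the paper.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Shift and scale scores to zero mean and unit variance (assumed normalization)."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All scores identical: no information, map everything to 0.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(subsystem_scores, weights):
    """Weighted average of normalized per-utterance scores from several subsystems.

    subsystem_scores: one list of scores per subsystem, all of the same length.
    weights: one non-negative weight per subsystem (hypothetical values).
    """
    if len(subsystem_scores) != len(weights):
        raise ValueError("need exactly one weight per subsystem")
    n = len(subsystem_scores[0])
    if any(len(s) != n for s in subsystem_scores):
        raise ValueError("all subsystems must score the same utterances")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    normalized = [z_normalize(s) for s in subsystem_scores]
    return [
        sum(w * s[i] for w, s in zip(weights, normalized)) / total
        for i in range(n)
    ]


# Made-up scores for four utterances (higher = more likely genuine).
proposed = [2.0, -1.0, 0.5, -2.0]   # stand-in for the IF/IE + RFCC subsystem
cqcc = [1.5, -0.5, 0.2, -1.8]       # stand-in for the magnitude (CQCC) subsystem
apgdf = [0.9, -1.2, 0.1, -0.7]      # stand-in for the phase (APGDF) subsystem

fused = fuse_scores([proposed, cqcc, apgdf], weights=[1.0, 1.0, 1.0])
```

A threshold on `fused` would then give the genuine/spoof decision. In practice the weights are usually tuned on a development set, and other fusion rules (for example logistic regression over the subsystem scores) are common alternatives.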
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering