Abstract: Automatic speaker verification systems (ASVs) verify a person’s identity by their voice and have been widely deployed for user authentication. However, existing ASVs rely on traditional audio spectral features and hence perform poorly when verifying pitch-changed utterances from speakers with a cold or a sore throat. In this article, we propose soundfield tracker (SOFTER), a soundfield-based speaker verification system that can verify speakers regardless of pitch changes. SOFTER is based on the observation that soundfield features reflect the speaker’s vocal tract, mouth, head, torso, etc., which are less affected by pitch changes in speech signals. SOFTER can be integrated into off-the-shelf smartphones without any hardware modifications. One major challenge is that the soundfield is sensitive to the distance between the speaker and the phone. To solve this problem, we propose a two-stage mechanism combining distance sensing and soundfield reconstruction, which reconstructs the soundfield to a setting similar to the one in the enrollment phase, so that the speaker can be verified at any distance from the phone. We compare SOFTER with six state-of-the-art academic and commercial ASVs on two data sets comprising 134 speakers and 31,000 speech samples. Results show that SOFTER achieves an equal error rate (EER) of 2.18% and 1.61% on the two data sets, respectively. Moreover, SOFTER outperforms the other ASVs by at least 24.67% on average in verifying pitch-varying or pathological speech samples, which is evidence of SOFTER’s effectiveness in both normal and unhealthy user conditions.
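The abstract reports results as an equal error rate (EER), the operating point at which the false-accept rate equals the false-reject rate. As a reminder of how this metric is usually computed from verification scores, here is a minimal sketch. It is not SOFTER's evaluation code: the function name `equal_error_rate`, the "accept when score >= threshold" convention, and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    Assumption: a higher score means "more likely the enrolled speaker",
    and a trial is accepted when its score is >= the threshold.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False-accept rate: impostor trials that would be accepted.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False-reject rate: genuine trials that would be rejected.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR as the EER estimate.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (hypothetical, not taken from the paper's data sets).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With only a handful of trials, the FAR and FRR curves rarely cross exactly, which is why the sketch picks the threshold with the smallest gap and reports the midpoint; evaluations on large trial lists often interpolate between neighbouring thresholds instead.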

Speaker Verification Based on Prosodic Features

Speaker verification based on prosodic features

Prosodic Features-Based Speaker Verification Using Speaker-Specific-Text for Short Utterances

Prosodic Features-Based Text-Dependent Speaker Recognition with Short Utterance

Toward Pitch-Insensitive Speaker Verification Via Soundfield

Speaker Verification Based on Factor Analysis and SVM

Speaker Verification Using Simple Temporal Features and Pitch Synchronous Cepstral Coefficients

A Speaker Verification Method Based on MFCC and Prosodic Features

Multi-Layered Features with SVM for Text-Independent Speaker Verification

Support Vector Regression Machine Adopted Speaker Verification System

Advances in SVM-Based System Using GMM Super Vectors for Text-Independent Speaker Verification

Interfusing the Confused Region Score of Speaker Verification Systems

Exploiting Prosodic Information for Speaker Recognition

Speaker Verification Based on Pitch Classified Feature Mapping

Speaker verification system based on M-vector and support vector machine

A PCA Method Based on Speaker Session Variability

The Estimation and Kernel Metric of Spectral Correlation for Text-Independent Speaker Verification

New Adaptation Method Using Two-Dimensional PCA for Speaker Verification

Channel Compensation Technology in Differential GSV-SVM Speaker Verification System

Integrating Sound Source Features for Robust Speaker Verification

SVM-Based Speaker Verification by Location in the Space of Reference Speakers