Abstract: Speaker verification performance degrades when test speech is recorded in sessions spread over a long period of time. Common ways to alleviate this long-term performance degradation are enrollment data augmentation, speaker model adaptation, and adapted verification thresholds. From the feature perspective of a pattern recognition system, robust features that are speaker-specific and invariant to time and acoustic environment are preferred for dealing with this long-term variability. In this paper, using a newly created speech database, CSLT-Chronos, collected specifically to reflect long-term speaker variability, we investigate the issue in the frequency domain by emphasizing higher discrimination of speaker-specific information and lower sensitivity to time-related, session-specific information. The F-ratio is employed as the figure of merit for judging these two kinds of information and for finding a compromise between them. Inspired by the feature extraction procedure of traditional MFCC calculation, we explore two emphasis strategies for generating modified acoustic features for speaker verification: pre-filtering frequency warping and post-filtering weighting of the filter-bank outputs. Experiments show that the two proposed features outperform traditional MFCC on CSLT-Chronos. The proposed approach is also studied on the NIST SRE 2008 database in a state-of-the-art i-vector based architecture. Experimental results demonstrate the advantage of the proposed features over MFCC in LDA and PLDA based i-vector systems. (c) 2016 Elsevier B.V. All rights reserved.
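
The abstract names the F-ratio as its criterion but does not give the exact formula, so the following is only a minimal sketch using the textbook definition (variance of the class means divided by the mean within-class variance, computed per frequency band). The function name `f_ratio`, the array layout, and the toy data are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def f_ratio(features, labels):
    """Per-band F-ratio (textbook form, assumed here):
    variance of class means / mean of within-class variances.

    features: array of shape (n_frames, n_bands), e.g. log filter-bank energies
    labels:   array of shape (n_frames,), one class id per frame
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # Mean and variance of every band, computed separately for each class.
    class_means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    class_vars = np.stack([features[labels == c].var(axis=0) for c in classes])
    between = class_means.var(axis=0)   # spread of the class means
    within = class_vars.mean(axis=0)    # average spread inside a class
    return between / within             # assumes non-zero within-class variance

# Toy data: 2 bands, 2 classes, 2 frames each (hypothetical values).
X = np.array([[0.0, 0.0], [2.0, 10.0], [10.0, 1.0], [12.0, 11.0]])
y = np.array([0, 0, 1, 1])
scores = f_ratio(X, y)  # band 0 separates the classes far better than band 1
```

Calling the same function with speaker ids as `labels` would score how well each band discriminates speakers, while calling it with session ids would score how sensitive each band is to session changes. How the two scores are traded off against each other is not specified in the abstract and is left open here.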

Robust Speaker Recognition in Cross-Channel Condition

Robust Speaker Recognition in Cross-Channel Condition Based on Gaussian Mixture Model

Toward Pitch-Insensitive Speaker Verification Via Soundfield

Glottal Information Based Spectral Recuperation in Multi-channel Speaker Recognition

Affect-Insensitive Speaker Recognition by Feature Variety Training

Channel Adversarial Training for Cross-channel Text-independent Speaker Recognition

Robust Channel Learning for Large-Scale Radio Speaker Verification

Research on Intersession Variability Compensation for MLLR-SVM Speaker Recognition

Channel Compensation Technology in Differential GSV-SVM Speaker Verification System

Robust telephone speech recognition based on channel compensation

Self-attention Based Speaker Recognition Using Cluster-Range Loss

Short Utterance Speaker Recognition Based on Speech High Frequency Information Compensation and Dynamic Feature Enhancement Methods

Improving Speaker Verification Performance Against Long-Term Speaker Variability

Robust speaker recognition using glottal information-based cepstral mean subtraction

A Cohort-Based Speaker Model Synthesis for Mismatched Channels in Speaker Verification

A speaker verification backend with robust performance across conditions

Adversarial Speaker Verification

An Approach To Robust Speaker Recognition Using Stochastic Matching

Speaker Recognition System in Multi-Channel Environment

Channel Compensation for Robust Telephone Speech Recognition

An Algorithm of Model Compensation Based on the Estimation of Additive Noise and Channel Function for Speech Recognition