Abstract: For voice conversion between different ages, a method based on Universal Background Model Groups (UBMG) of short-time spectra and prosodic features is proposed. On the spectral side, a Gaussian Mixture Model (GMM) is trained for every speaker from extracted linear predictive cepstral coefficients; the speakers within the same age period are then clustered according to voice similarity, and each cluster is further trained into a UBM of the spectrum distribution. As a result, a UBM group and the corresponding spectrum conversion functions are obtained for each age period. Formant adjustment is applied after spectrum conversion. On the prosodic side, fundamental frequency is modeled by a single Gaussian and speech rate by the average duration ratio, and conversion functions are derived from each. Objective and subjective evaluations, including ABX and MOS tests, show that the proposed method has a distinct advantage over the conventional bilinear method, and that its change rate of log-likelihood ratio is 4% higher than that of the single-UBM method. The converted speech is closer to the speech of the target age period while retaining good speech quality.
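The prosodic conversion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the conversion formulas, so the code below assumes the commonly used Gaussian normalization for fundamental frequency (shift and scale by the source and target means and standard deviations) and a simple target-to-source ratio of average durations for speech rate. The function names, the use of linear Hz rather than log F0, and the toy values are all hypothetical.

```python
from statistics import mean, pstdev


def fit_gaussian(values):
    """Fit a single Gaussian to voiced F0 values; return (mean, std)."""
    return mean(values), pstdev(values)


def convert_f0(f0_track, src_stats, tgt_stats):
    """Map source F0 onto the target distribution (assumed Gaussian
    normalization). Frames with F0 == 0.0 are treated as unvoiced and kept."""
    mu_s, sd_s = src_stats
    mu_t, sd_t = tgt_stats
    return [
        mu_t + (sd_t / sd_s) * (f - mu_s) if f > 0 else 0.0
        for f in f0_track
    ]


def duration_ratio(src_durations, tgt_durations):
    """Average-duration ratio (target / source); values above 1 slow speech."""
    return mean(tgt_durations) / mean(src_durations)


# Hypothetical toy data (Hz and seconds), not taken from the paper.
src_stats = fit_gaussian([200.0, 220.0, 240.0])
tgt_stats = fit_gaussian([100.0, 120.0, 140.0])
converted = convert_f0([220.0, 0.0, 240.0], src_stats, tgt_stats)
ratio = duration_ratio([0.20, 0.30], [0.25, 0.35])
```

In a full system, the statistics would be estimated per age period rather than per toy list, and the duration ratio would drive a time-scale modification step.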

Voice conversion of different ages using universal background model groups of short-time spectra and prosodic features
