Speaker-independent Training and Adaptation of Neural Vocoders

Hong-chuan WU, Zhen-hua LING
DOI: https://doi.org/10.3969/j.issn.1000-1220.2019.02.035
2019-01-01
Abstract: In recent years, WaveNet-based neural vocoders have achieved high-quality reconstructed speech. However, because they are trained in a speaker-dependent manner, their performance depends on the amount of speech data available for the target speaker. In this paper, we study how to train neural vocoders with limited target-speaker data. In the proposed method, a speaker-independent WaveNet vocoder is first trained on a multi-speaker speech corpus. The parameters of the speaker-independent model are then adaptively updated to obtain a neural vocoder for the target speaker. In our experiments, we first compare a local updating strategy with a global updating strategy for adaptive training, and then compare adaptive training with speaker-dependent training on the same training data. The results show that the neural vocoder built with the proposed method reconstructs speech with higher quality than STRAIGHT, and that, with limited target-speaker data, the method outperforms speaker-dependent training in both objective and subjective evaluations.
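The two-stage procedure described in the abstract can be illustrated with a deliberately tiny sketch. This is not the authors' implementation: the two-parameter model, the parameter names `condition` and `output`, the synthetic data, and the learning rate are all hypothetical stand-ins for a real WaveNet vocoder. The abstract also does not say which parameters the local strategy updates, so updating only the `output` parameter here is an assumption made purely for illustration.

```python
# Toy stand-in for a vocoder: y = output * (condition * x).
# A real WaveNet vocoder is far larger; this only shows the training flow.

def predict(params, x):
    return params["output"] * (params["condition"] * x)

def gradients(params, x, y):
    """Gradients of the squared error (predict - y)**2 for one sample."""
    hidden = params["condition"] * x
    err = params["output"] * hidden - y
    return {
        "output": 2 * err * hidden,
        "condition": 2 * err * params["output"] * x,
    }

def train(params, data, trainable, lr=0.01, epochs=200):
    """Plain SGD that only updates the parameter names listed in `trainable`."""
    params = dict(params)  # copy so the starting model is left untouched
    for _ in range(epochs):
        for x, y in data:
            g = gradients(params, x, y)
            for name in trainable:
                params[name] -= lr * g[name]
    return params

def mse(params, data):
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)

# Stage 1: speaker-independent training on pooled multi-speaker data.
# Each hypothetical "speaker" is a different slope s in y = s * x.
multi_speaker = [(x, s * x) for s in (1.5, 2.0, 2.5) for x in (0.5, 1.0, 1.5)]
si_model = train({"condition": 1.0, "output": 1.0}, multi_speaker,
                 trainable=("condition", "output"))

# Stage 2: adaptation on a small amount of target-speaker data (slope 3.0).
target = [(1.0, 3.0), (0.5, 1.5)]

# Global updating: every parameter is fine-tuned.
global_adapted = train(si_model, target,
                       trainable=("condition", "output"), epochs=50)

# Local updating (assumed here to mean a subset of parameters): the rest stay frozen.
local_adapted = train(si_model, target, trainable=("output",), epochs=50)

print("SI model error on target:      ", mse(si_model, target))
print("Global adaptation error:       ", mse(global_adapted, target))
print("Local adaptation error:        ", mse(local_adapted, target))
```

In this sketch both strategies start from the speaker-independent parameters rather than from scratch, which is the point of the adaptation stage. Which strategy works better for an actual vocoder is an empirical question that the toy example does not answer.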