Enhancement of Mandarin electrolarynx speech based on voice conversion technology
Rui DONG, Lifeng LI, Haijun NIU, Wanqing SHI, Yang LI
DOI: https://doi.org/10.3969/j.issn.1002-3208.2015.04.06
2015-01-01
Abstract: Objective: The electrolarynx (EL) is the most common assistive device for providing a voice to laryngectomees. However, EL speech still has several severe problems, such as highly unnatural sound quality and considerable radiated noise. In this paper, we study the enhancement of EL speech based on voice conversion (VC) technology in order to improve its naturalness and intelligibility. Methods: 200 pairs of everyday Mandarin utterances, each recorded as normal speech and as EL speech, served as training data. A Gaussian mixture model (GMM) based method was used to improve the quality of EL speech, and subjective and objective evaluations were used to assess the converted speech. The converted features were F0 and the spectral parameters (0th through 24th mel-cepstral coefficients). Results: The objective results demonstrated that the VC-based method could greatly reduce the radiated noise and bring the F0 contour of Mandarin EL speech closer to that of the target speech. The subjective results indicated that the naturalness and acceptability of Mandarin EL speech improved, while intelligibility did not differ significantly after conversion. Conclusions: VC technology can effectively reduce high-frequency radiated noise, supplement tone and rhythm information, and improve the naturalness and acceptability of EL speech, all of which help improve speech quality.
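The abstract does not say which GMM formulation was used. As an illustration only, the sketch below assumes the common joint-density variant, in which a GMM is fitted to stacked source (EL) and target (normal) feature vectors and each source frame is mapped to the posterior-weighted conditional mean of the target. The function name `gmm_convert` and its parameters are hypothetical, and the GMM parameters are assumed to have been trained already on time-aligned utterance pairs.

```python
import numpy as np

def gmm_convert(x, weights, means, covs, dx):
    """Map source frames to target frames with a joint-density GMM (MMSE rule).

    x:       (N, dx) source feature frames, e.g. EL mel-cepstra (assumed layout)
    weights: (M,) mixture weights
    means:   (M, dx + dy) joint means, source part first
    covs:    (M, dx + dy, dx + dy) joint full covariances
    dx:      dimensionality of the source part
    """
    x = np.atleast_2d(x)
    n_frames, n_mix = x.shape[0], len(weights)
    dy = means.shape[1] - dx
    log_post = np.empty((n_frames, n_mix))
    cond_means = np.empty((n_mix, n_frames, dy))
    for m in range(n_mix):
        mu_x, mu_y = means[m, :dx], means[m, dx:]
        s_xx = covs[m, :dx, :dx]   # source covariance block
        s_yx = covs[m, dx:, :dx]   # target-source cross-covariance block
        diff = x - mu_x
        # diff @ inv(s_xx), computed without forming the inverse explicitly
        solved = np.linalg.solve(s_xx, diff.T).T
        _, logdet = np.linalg.slogdet(s_xx)
        maha = np.sum(diff * solved, axis=1)
        # log of weight * N(x; mu_x, s_xx)
        log_post[:, m] = np.log(weights[m]) - 0.5 * (
            maha + logdet + dx * np.log(2.0 * np.pi)
        )
        # Conditional mean of the target given x for component m
        cond_means[m] = mu_y + solved @ s_yx.T
    # Normalize to posteriors P(m | x) in a numerically stable way
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)
    # Posterior-weighted sum of the per-component conditional means
    return np.einsum("nm,mnd->nd", post, cond_means)
```

With a single mixture component the mapping reduces to ordinary linear regression from source to target features; additional components make it piecewise linear, with soft switching between regions of the source feature space.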