HiFi-SVC: Fast High Fidelity Cross-Domain Singing Voice Conversion.

Yong Zhou, Xiangju Lu
DOI: https://doi.org/10.1109/icassp43922.2022.9746812
2022-01-01
Abstract: This paper presents HiFi-SVC, a small cross-domain singing voice conversion model for generating high-fidelity 22.05 kHz singing voices. Building on the state-of-the-art neural vocoder HiFi-GAN and a convolution-based module for modeling F0, HiFi-SVC can be trained end-to-end with either speech or singing data, achieving better voice similarity than FastSVC on two of the datasets while using a slightly smaller number of parameters. We also propose a pitch adjustment method for improving conversion quality.