Latent time-frequency component analysis: A novel pitch-based approach for singing voice separation

Xiu Zhang, Wei Li, Bilei Zhu
DOI: https://doi.org/10.1109/ICASSP.2015.7177946
2015-01-01
Abstract: Monaural singing voice separation has attracted considerable attention. Many pitch-based methods have been proposed to address this task, but their performance is generally limited. The most crucial difficulties are the inaccurate judgment of voiced pitches and the failure to recognize unvoiced singing sounds. In this paper, we propose a novel algorithm based on latent component analysis of the time-frequency representation to overcome these difficulties. Specifically, the time-frequency (T-F) representations of the song are first decomposed into components, each of which approximately originates from a single sound source. We then construct non-overlapping T-F segments from these components to recover useful singing voice information that would otherwise be omitted. Extensive experiments on the public MIR-1K dataset show the effectiveness of the proposed algorithm.