A Joint Pitch Estimation and Voicing Detection Method for Melody Extraction

Weiwei Zhang, Rong Wang, Qiaoling Zhang, Shaojun Fang
DOI: https://doi.org/10.1016/j.apacoust.2020.107338
IF: 3.614
2020-01-01
Applied Acoustics
Abstract: Melody extraction from polyphonic music remains an open problem because of the intrinsic complexity of real-world music. Pitch estimation and voicing detection are two critical subproblems in melody extraction. In this paper, a joint pitch estimation and voicing detection method is proposed. More specifically, the unvoiced case is assigned its own class label, alongside the semitone-level pitch classes, so that a single classifier handles both subproblems. To automatically learn nonlinear features of melody, an extreme learning machine (ELM) is introduced to extract melody from polyphonic music efficiently and effectively. Furthermore, to make use of unlabeled data, semi-supervised ELM (SSELM) is adopted in this work. The input weights and hidden biases are randomly generated, and the output weight matrix connecting the hidden layer and the output layer is calculated by taking into account both labeled and unlabeled data. Finally, the melody pitches predicted by SSELM are fine-tuned to follow dynamic pitch changes and obtain a smoother melody contour. The proposed method features fast learning and good generalization. Moreover, it provides a new melody extraction framework from the perspective of semi-supervised learning. The proposed method is evaluated and compared with several reference methods on three publicly available collections. Experimental results demonstrate that the proposed method obtains promising performance.
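The classifier described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a commonly used semi-supervised ELM formulation in which the output weights solve a ridge-regularized least-squares problem with an added graph-Laplacian smoothness term over labeled and unlabeled frames. The function names, the k-nearest-neighbour graph, the sigmoid activation, the hyperparameters `C` and `lam`, and the convention that class 0 means "unvoiced" are all assumptions made for illustration.

```python
import numpy as np

def hidden_layer(X, W, b):
    # Sigmoid activations of a random hidden layer that is never trained.
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def knn_laplacian(X, k=5):
    # Unnormalized Laplacian L = D - A of a symmetrized k-NN graph
    # (assumed graph construction; requires more than k samples).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    n = len(X)
    idx = np.argsort(d2, axis=1)[:, 1:k + 1]  # column 0 is the point itself
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)
    return np.diag(A.sum(axis=1)) - A

def train_sselm(X_lab, y_lab, X_unl, n_classes,
                n_hidden=64, C=10.0, lam=0.1, seed=0):
    # y_lab holds integer labels; by assumption 0 = unvoiced and
    # 1..n_classes-1 = semitone-level pitch classes.
    rng = np.random.default_rng(seed)
    X = np.vstack([X_lab, X_unl])
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = hidden_layer(X, W, b)
    n_lab = len(X_lab)
    Y = np.zeros((len(X), n_classes))            # unlabeled rows stay zero
    Y[np.arange(n_lab), y_lab] = 1.0
    c = np.zeros(len(X))
    c[:n_lab] = C                                # penalty on labeled rows only
    L = knn_laplacian(X)
    # Solve (I + H^T C H + lam * H^T L H) beta = H^T C Y for beta.
    A = np.eye(n_hidden) + H.T @ (c[:, None] * H) + lam * (H.T @ L @ H)
    beta = np.linalg.solve(A, H.T @ (c[:, None] * Y))
    return W, b, beta

def predict(X, W, b, beta):
    # One label per frame: 0 = unvoiced, otherwise a pitch class index.
    return np.argmax(hidden_layer(X, W, b) @ beta, axis=1)
```

Only `beta` is learned, and it comes from a single linear solve, which is consistent with the fast learning the abstract attributes to the method. The final smoothing of the predicted pitch contour is not sketched here because the abstract does not describe how it is done.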