Melody Extraction Based on Deep Harmonic Neural Network

Yuzhi Huang, Gang Liu
DOI: https://doi.org/10.1109/ICNIDC.2018.8525721
2018-01-01
Abstract: The main task of melody extraction is to estimate the fundamental frequency contour of the singing voice in polyphonic music, where the vocals and the background accompaniment are mixed. It has many applications in music information retrieval. Fujishima, Hermes et al. applied the theory of subharmonic summation (SHS) to extract the fundamental frequency [1][2]. Sangeun Kum et al. applied neural networks to melody extraction and obtained state-of-the-art results [3][4]. This paper combines the theory of harmonic structure with neural networks and proposes a new network, DHNN (Deep Harmonic Neural Network), for melody extraction. Compared with methods that do not use neural networks, DHNN introduces supervised learning and sequence-to-sequence relationships. Compared with the neural network method of Sangeun Kum, DHNN incorporates harmonic structure, which makes it better suited to melody extraction. The paper implements two variants of the network, RNN-DHNN and CNN-DHNN; in experiments on the MIREX1k and mirex05 datasets, their results are close to, or even beyond, the state of the art.
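To make the subharmonic summation idea mentioned in the abstract concrete, the following is a minimal sketch of a simplified SHS pitch estimator. It is not the DHNN and is not taken from the paper: the function names (`magnitude_at`, `shs_salience`, `shs_pitch`), the decay factor of 0.84, the five-harmonic limit, the candidate grid, and the synthetic test tone are all assumptions chosen for illustration. Each candidate fundamental is scored by summing the spectral magnitude at its harmonics, with higher harmonics weighted less.

```python
import cmath
import math


def magnitude_at(signal, sample_rate, freq_hz):
    """Magnitude of the signal's spectrum at one frequency (direct DFT sum)."""
    step = -2j * math.pi * freq_hz / sample_rate
    return abs(sum(x * cmath.exp(step * n) for n, x in enumerate(signal)))


def shs_salience(signal, sample_rate, f0, num_harmonics=5, decay=0.84):
    """Simplified subharmonic summation: sum of decay**(k-1) * |X(k * f0)|.

    decay=0.84 and num_harmonics=5 are illustrative assumptions.
    """
    return sum(
        decay ** (k - 1) * magnitude_at(signal, sample_rate, k * f0)
        for k in range(1, num_harmonics + 1)
    )


def shs_pitch(signal, sample_rate, candidates):
    """Return the candidate fundamental frequency with the highest salience."""
    return max(candidates, key=lambda f0: shs_salience(signal, sample_rate, f0))


# Synthetic harmonic tone (hypothetical test signal): 220 Hz fundamental
# with five harmonics of decreasing amplitude.
sample_rate = 8000
amplitudes = [1.0, 0.6, 0.4, 0.3, 0.2]
signal = [
    sum(a * math.sin(2 * math.pi * 220 * (k + 1) * n / sample_rate)
        for k, a in enumerate(amplitudes))
    for n in range(1024)
]

candidates = range(100, 501, 5)  # candidate f0 grid in Hz (assumed)
estimate = shs_pitch(signal, sample_rate, candidates)
print(estimate)
```

The decaying weights are what keep the estimator from locking onto the octave below (110 Hz), whose even harmonics coincide with those of the true fundamental but receive smaller weights.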