A Parallel Fusion Approach to Piano Music Transcription Based on Convolutional Neural Network

Fu'ze Cong, Shuchang Liu, Li Guo, Geraint A. Wiggins
DOI: https://doi.org/10.1109/icassp.2018.8461794
2018-01-01
Abstract: In this paper, a supervised approach based on Convolutional Neural Networks (CNN) for polyphonic piano transcription is presented. The system consists of a pitch detection model, an onset/offset detection model, and a note search model. The pitch detection model is a single-channel CNN that predicts the probabilities of the pitches contained in one frame of the audio. The onset/offset model, based on a dual-channel CNN, estimates the probability of each pitch's onset or offset in a frame. The note search model is rule-based; it integrates the outputs of the pitch model and the onset/offset model to determine the final onset, offset, and pitch of each note in the audio. Two experiments with different dataset conditions are conducted to compare the system with state-of-the-art approaches on the same datasets. Experimental results show that the proposed approach performs better in both frame- and note-based metrics.
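The abstract does not spell out the rules used by the note search model, so the following is only a minimal sketch of how frame-level pitch, onset, and offset probabilities could be fused into notes. The function name `search_notes`, the single threshold `thr`, the hop size `hop_s`, and the open/close rules themselves are all illustrative assumptions, not the authors' method.

```python
import numpy as np


def search_notes(pitch_prob, onset_prob, offset_prob, hop_s=0.01, thr=0.5):
    """Hypothetical rule-based fusion of three frame-level probability maps.

    Each input has shape (n_frames, n_pitches) with values in [0, 1].
    Returns a list of (pitch_index, onset_time_s, offset_time_s) tuples.
    All rules and thresholds here are assumptions made for illustration.
    """
    n_frames, n_pitches = pitch_prob.shape
    notes = []
    for p in range(n_pitches):
        start = None  # frame index at which the current note opened
        for t in range(n_frames):
            active = pitch_prob[t, p] >= thr
            if start is None:
                # Assumed rule: open a note only when the onset model
                # and the pitch model agree on this frame.
                if active and onset_prob[t, p] >= thr:
                    start = t
            elif offset_prob[t, p] >= thr or not active:
                # Assumed rule: close the note when an offset fires
                # or the pitch model stops reporting the pitch.
                notes.append((p, start * hop_s, t * hop_s))
                start = None
        if start is not None:
            # A note still open at the end of the audio is closed there.
            notes.append((p, start * hop_s, n_frames * hop_s))
    return notes
```

In this sketch a repeated onset while a note is still open is ignored; a fuller rule set would likely split the note there. The point is only that the pitch model gates which frames may belong to a note, while the onset/offset model sets its boundaries.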