Robust Piano Music Transcription Based on Computer Vision

Jun Li, Wei Xu, Yong Cao, Wei Liu, Wenqing Cheng
DOI: https://doi.org/10.1145/3409501.3409540
2020-01-01
Abstract: Automatic music transcription, which aims to convert acoustic music signals into symbolic notation, has recently attracted increasing attention. To address the challenges of transcription based on acoustic information, traditional vision-based approaches adopt the Hough transform to locate the piano keyboard and a weak classifier to detect pressed keys. However, the Hough transform and the weak classifier show insufficient detection ability in changing environments. In this paper, we devise a robust visual piano transcription system that uses semantic segmentation for piano keyboard detection and a CNN-based classifier to detect pressed keys, which improves frame-level transcription results. In addition, given the lack of public datasets for visual piano transcription, we propose a new dataset for this task. To demonstrate the effectiveness of our system, we evaluate it on both a previously published dataset and our proposed dataset; our system significantly outperforms the state-of-the-art approaches.