Piano Transcription with Harmonic Attention.

Ruimin Wu, Xianke Wang, Yuqing Li, Wei Xu, Wenqing Cheng
DOI: https://doi.org/10.1109/ICASSP48485.2024.10447324
2024-01-01
Abstract: Automatic Music Transcription (AMT) aims to convert music audio into digital sheet music. Piano transcription is a popular but challenging subtask of AMT. For every piano pitch, the harmonic structure is fixed in the frequency domain, and the self-attention-based Transformer has great potential for extracting features from long sequences. In this paper, we propose piano harmonic attention, a masked self-attention mechanism, to better capture harmonic features. The mask matrix is designed from a harmonic prior so that the harmonic structure is pre-modeled when attention scores are calculated. To verify its effectiveness, we append a harmonic attention-based Transformer after every convolutional neural network block of the High-resolution piano transcription system. Evaluation results on the MAESTRO dataset show that the proposed model achieves comprehensive improvements over the baseline, with a note F1 score of 97.33%, which is comparable to the state-of-the-art system.
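The idea of masking self-attention with a harmonic prior can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the mask design, so the sketch assumes a log-frequency axis with `bins_per_octave` bins per octave and lets each bin attend only to bins whose distance matches an integer harmonic ratio. The function names `harmonic_mask` and `masked_self_attention`, the parameter values, and the use of the input itself as query, key, and value are all hypothetical simplifications.

```python
import numpy as np


def harmonic_mask(n_bins, bins_per_octave=12, n_harmonics=4):
    """Boolean (n_bins, n_bins) mask; True where attention is allowed.

    Assumption: on a log-frequency axis, the k-th harmonic of a bin lies
    round(bins_per_octave * log2(k)) bins above it. The mask is symmetric,
    so sub-harmonic positions are allowed as well.
    """
    offsets = sorted(
        {int(round(bins_per_octave * np.log2(k))) for k in range(1, n_harmonics + 1)}
    )
    idx = np.arange(n_bins)
    diff = np.abs(idx[:, None] - idx[None, :])
    return np.isin(diff, offsets)


def masked_self_attention(x, mask):
    """Single-head self-attention over frequency bins with a fixed mask.

    x: (n_bins, d) features. For brevity, queries, keys, and values are x
    itself (no learned projections).
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    # Disallowed positions get -inf so their softmax weight is exactly zero.
    scores = np.where(mask, scores, -np.inf)
    # The diagonal is always allowed (offset 0), so each row max is finite.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x, weights


rng = np.random.default_rng(0)
mask = harmonic_mask(88)                 # 88 bins, one per piano key (assumed)
features = rng.standard_normal((88, 16))
out, weights = masked_self_attention(features, mask)
```

With these assumed settings, the allowed offsets are 0, 12, 19, and 24 bins, so every attention weight outside those harmonic positions is zero and each row of `weights` still sums to one.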