End-to-End Optical Music Recognition with Attention Mechanism and Memory Units Optimization.

Ruichen He, Junfeng Yao
DOI: https://doi.org/10.1007/978-981-99-8432-9_32
2024-01-01
Abstract: Optical Music Recognition (OMR) is the research field that studies how computers can read the music notation in score documents. In this paper, we propose ATTML, an end-to-end OMR model that combines memory unit optimization with an attention mechanism. First, we replace the original LSTM memory unit with a Mogrifier LSTM memory unit, which lets the input and the hidden state interact fully before each update and yields better context-dependent representations. Second, we augment the decoder with the ECA attention mechanism, so that the model focuses better on salient features and patterns in the input data. We jointly train on three existing music datasets: PrIMuS, Doremi, and Deepscores. Our ablation experiments cover several attention mechanisms and memory optimization units. We also use the score density metric SnSl to compare our model with others and to assess its performance specifically on dense musical scores. Comparative and ablation results show that the proposed method outperforms previous state-of-the-art methods in accuracy and robustness.
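To make the two building blocks named in the abstract concrete, here is a minimal sketch of Mogrifier-style gating and ECA-style channel attention in plain Python. It follows the general published formulations of those techniques and is not the ATTML implementation: the function names (`mogrify`, `eca`), the list-based tensors, the number of gating rounds, and the kernel size are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]


def mogrify(x, h, q_mats, r_mats):
    """Mogrifier-style gating applied before an ordinary LSTM update.

    Odd rounds rescale the input x with a gate computed from h;
    even rounds rescale the hidden state h with a gate computed from x.
    q_mats[k] has shape len(x) x len(h); r_mats[k] has shape len(h) x len(x).
    Assumes len(q_mats) equals len(r_mats) or len(r_mats) + 1.
    """
    rounds = len(q_mats) + len(r_mats)
    qi = ri = 0
    for i in range(1, rounds + 1):
        if i % 2 == 1:
            gate = matvec(q_mats[qi], h)
            x = [2.0 * sigmoid(g) * xv for g, xv in zip(gate, x)]
            qi += 1
        else:
            gate = matvec(r_mats[ri], x)
            h = [2.0 * sigmoid(g) * hv for g, hv in zip(gate, h)]
            ri += 1
    return x, h  # these would then feed a standard LSTM cell


def eca(channels, kernel):
    """ECA-style channel attention.

    channels: list of C lists, each a flattened feature map for one channel.
    kernel: odd-length list of 1D convolution weights slid across channels.
    """
    # Squeeze: one global-average descriptor per channel.
    means = [sum(c) / len(c) for c in channels]
    # Local cross-channel interaction: zero-padded 1D convolution.
    pad = len(kernel) // 2
    padded = [0.0] * pad + means + [0.0] * pad
    weights = [
        sigmoid(sum(k * padded[i + j] for j, k in enumerate(kernel)))
        for i in range(len(means))
    ]
    # Excite: rescale every channel by its attention weight.
    return [[w * v for v in c] for w, c in zip(weights, channels)]
```

With all-zero gating matrices every Mogrifier gate equals 2·sigmoid(0) = 1, so `mogrify` leaves `x` and `h` unchanged; learned weights are what let the two vectors modulate each other. In the same way, an all-zero ECA kernel scales every channel by 0.5.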