A detector for page-level handwritten music object recognition based on deep learning

Yusen Zhang,Zhiqing Huang,Yanxin Zhang,Keyan Ren
DOI: https://doi.org/10.1007/s00521-023-08216-6
2023-01-23
Neural Computing and Applications
Abstract:Handwritten music recognition (HMR) is the technology of transcribing the content of images of music scores. The accurate detection of music objects at the page level is one of the main challenges of HMR. Thus far, the existing methods suffer from the tiny and dense nature of handwritten music notations and realize positive detection accuracy only on snippets. In this paper, we propose a detector that consists of a staff line removal model and a handwritten music object detection model for page-level handwritten music object recognition. First, an end-to-end staff line removal model R_Staff_Net based on residual learning reduces the complexity of page-level detection. Second, we developed an improved YOLO-V4 model for handwritten music object detection. The improvements mainly concern the adoption of a de-coupled detection head and visual attention module in the YOLO-V4, and an adaptive multi-scale feature fusion module AMFFM is used to enhance the textures and features of tiny music symbols in the deep convolution layers, and the gradient harmonized mechanism is utilized to address the inherent imbalance between music objects. We verified the R_Staff_Net and the improved YOLO-V4 model on the ICDAR/GREC staff line removal dataset and the MUSCIMA++ dataset, respectively. The experiments highlight that R_Staff_Net presents outstanding performance with an F-M score of 98.64%, and our improved YOLO-V4 model is superior to other handwritten music symbol detection methods with a mean average precision (mAP) of 91.8% when addressing page-level input. Although the experimental results of the detector show that the R_Staff_Net helps little to the overall mAP, the network is beneficial for symbols that are similar to staff lines or heavily overlap staff lines.
computer science, artificial intelligence
What problem does this paper attempt to address?