Template-based text field segmentation for ID documents using dynamic squeezeboxes packing

Zingerenko, Michael; Limonova, Elena; Arlazarov, Vladimir V.
DOI: https://doi.org/10.1007/s11042-024-20162-6
IF: 2.577
2024-09-19
Multimedia Tools and Applications
Abstract: In this paper, we focus on the problem of text field segmentation in identity documents. These documents have fixed layouts, which makes them well suited to computationally efficient template-based algorithms. We consider the Dynamic Squeezeboxes Packing method and demonstrate its integration into document recognition systems using a single sample per document type. We benchmark text field segmentation on the MIDV-2019 public dataset using the standard intersection-over-union metric and our custom intersection-over-template metric, and we also measure processing time. We demonstrate that Dynamic Squeezeboxes Packing maintains competitive quality compared to text-in-the-wild methods (EAST, CRAFT) and a named-entity recognition method (LayoutLMv2). A significant advantage of this method is its processing speed: it averages 9 ms per image on the x86_64 platform, which is substantially faster than EAST (980 ms), CRAFT (2030 ms), and LayoutLMv2 (2210 ms). The results suggest that the method has strong potential in document image analysis, particularly for processing identity documents.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
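
The abstract names two overlap metrics for scoring a predicted text field against its reference region. The sketch below shows the standard intersection-over-union for axis-aligned boxes. The abstract does not define intersection-over-template, so the `iot` function here is only one plausible reading of the name (intersection area divided by the template box area) and should not be taken as the paper's definition. The `Box` type and all function names are hypothetical, introduced for illustration.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle in pixels: (left, top, right, bottom)."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def area(self) -> float:
        # Degenerate or inverted boxes count as zero area.
        return max(0.0, self.right - self.left) * max(0.0, self.bottom - self.top)


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (0 if they are disjoint)."""
    width = min(a.right, b.right) - max(a.left, b.left)
    height = min(a.bottom, b.bottom) - max(a.top, b.top)
    return max(0.0, width) * max(0.0, height)


def iou(predicted: Box, reference: Box) -> float:
    """Standard intersection-over-union."""
    inter = intersection_area(predicted, reference)
    union = predicted.area + reference.area - inter
    return inter / union if union > 0 else 0.0


def iot(predicted: Box, template: Box) -> float:
    """ASSUMPTION: intersection area over template area.

    This is a guess based on the metric's name only; the abstract
    gives no formula.
    """
    if template.area <= 0:
        return 0.0
    return intersection_area(predicted, template) / template.area


# Example: a 10x10 prediction shifted right by half its width.
predicted = Box(0, 0, 10, 10)
reference = Box(5, 0, 15, 10)
print(iou(predicted, reference), iot(predicted, reference))
```

For the example boxes, the overlap is 50 square pixels and the union is 150, so the IoU is 1/3, while the assumed IoT is 0.5 because the denominator ignores the part of the prediction that lies outside the template.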