Image Layer Modeling for Complex Document Layout Generation.

Tianlong Ma, Xingjiao Wu, Xiangcheng Du, Yanlong Wang, Cheng Jin
DOI: https://doi.org/10.1109/ICME55011.2023.00386
2023-01-01
Abstract: Document layout analysis (DLA) plays an essential role in information extraction and document understanding. DLA has reached milestone achievements; however, DLA of non-Manhattan layouts remains challenging because annotated data are limited. In this paper, we propose an image layer modeling method to mitigate this issue. The method generates document images with non-Manhattan layouts by superimposing images under pre-defined aesthetic rules. Because no evaluation benchmark exists for non-Manhattan layouts, we have constructed a manually labeled dataset for fine-grained segmentation of non-Manhattan layouts. To the best of our knowledge, this is the first such dataset. Extensive experimental results verify that the proposed image layer modeling method can better handle fine-grained segmentation of documents with non-Manhattan layouts.
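The idea of superimposing image layers under placement rules can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the aesthetic rules, so the two rules below (a page margin and an overlap cap), the function names `ellipse_mask`, `rect_mask`, and `compose`, and all parameter values are assumptions chosen for illustration. The sketch shows why layer-based synthesis yields pixel-level annotations for free: each placed layer writes its own label into the label map, including non-rectangular regions.

```python
import random


def ellipse_mask(w, h):
    """Binary mask of an ellipse inscribed in a w x h box (a non-rectangular region)."""
    cx, cy = (w - 1) / 2, (h - 1) / 2
    rx, ry = w / 2, h / 2
    return [[1 if ((x - cx) / rx) ** 2 + ((y - cy) / ry) ** 2 <= 1 else 0
             for x in range(w)]
            for y in range(h)]


def rect_mask(w, h):
    """Binary mask of a full w x h rectangle."""
    return [[1] * w for _ in range(h)]


def compose(canvas_w, canvas_h, layers, margin=2, max_overlap=0.2, seed=0, tries=50):
    """Superimpose (label, mask) layers in order.

    Returns the label map (0 = background) and the list of placed layers
    as (label, x0, y0). Both rules below are hypothetical stand-ins for
    the paper's pre-defined aesthetic rules.
    """
    rng = random.Random(seed)
    labels = [[0] * canvas_w for _ in range(canvas_h)]
    placed = []
    for label, mask in layers:
        h, w = len(mask), len(mask[0])
        area = sum(map(sum, mask))
        for _ in range(tries):
            # Assumed rule 1: keep every layer inside the page margin.
            x0 = rng.randint(margin, canvas_w - margin - w)
            y0 = rng.randint(margin, canvas_h - margin - h)
            # Assumed rule 2: the new layer may cover existing content
            # on at most max_overlap of its own pixels.
            covered = sum(1 for y in range(h) for x in range(w)
                          if mask[y][x] and labels[y0 + y][x0 + x] != 0)
            if covered <= max_overlap * area:
                for y in range(h):
                    for x in range(w):
                        if mask[y][x]:
                            labels[y0 + y][x0 + x] = label
                placed.append((label, x0, y0))
                break
    return labels, placed


if __name__ == "__main__":
    page, placed = compose(40, 30, [(1, rect_mask(20, 10)), (2, ellipse_mask(11, 9))])
    for row in page:
        print("".join(".#o"[v] for v in row))
```

A layer that cannot satisfy both rules within `tries` attempts is simply skipped; a real generator would likely also vary scale, rotation, and layer content.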