DCT-CompSegNet: fast layout segmentation in DCT compressed JPEG document images using deep feature learning
Bulla Rajesh, Sk Mahafuz Zaman, Mohammed Javed, Meng Lin
DOI: https://doi.org/10.1007/s11042-024-18204-0
IF: 2.577
2024-01-23
Multimedia Tools and Applications
Abstract: Layout segmentation remains challenging for document images such as newspapers, magazines, and research articles, in which text and non-text components are arranged artistically to attract different types of readers. Traditionally, layout segmentation has been carried out in the pixel domain, on the assumption that images are always available in uncompressed pixel form. In practice, however, images are acquired and rendered in compressed form, so traditional techniques require an additional decompression stage to recover the pixels before further processing. This paper therefore proposes direct layout segmentation in compressed document images, bypassing the decompression stage while providing good performance with reduced computation time. It introduces a novel deep learning architecture, DCT-CompSegNet, that learns features straight from the DCT compressed streams of JPEG documents to accomplish layout segmentation directly in the JPEG compressed domain. Unlike existing layout segmentation methods that work in the pixel domain, the network is trained on a compressed stream of DCT coefficients extracted from the JPEG documents. The learned features are effective enough to segment layouts in both printed and handwritten document images with state-of-the-art performance. Experiments were carried out on two benchmark datasets, Publay and Prima, which consist of complex machine-printed document images, and the robustness of the model is also demonstrated on a self-created Handwritten dataset.
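To make concrete what "DCT coefficients" means here, the following is a minimal sketch of the 8x8 block DCT that baseline JPEG applies to a grayscale image. It is an illustration only, not the authors' pipeline: the abstract describes extracting coefficients from the compressed stream, whereas this sketch recomputes them from pixels and omits quantization and entropy coding. The helper names (`dct_matrix`, `block_dct`, `image_to_dct_blocks`) are hypothetical.

```python
import math

N = 8  # JPEG operates on 8x8 blocks


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C, so that coeffs = C * block * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
                     for x in range(n)])
    return rows


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(col) for col in zip(*m)]


def block_dct(block):
    """2-D DCT-II of one square block after the JPEG level shift of 128."""
    c = dct_matrix(len(block))
    shifted = [[v - 128 for v in row] for row in block]
    return matmul(matmul(c, shifted), transpose(c))


def image_to_dct_blocks(image):
    """Split a grayscale image (list of rows, sides multiples of 8)
    into a grid of 8x8 DCT coefficient blocks."""
    h, w = len(image), len(image[0])
    return [[block_dct([row[x:x + N] for row in image[y:y + N]])
             for x in range(0, w, N)]
            for y in range(0, h, N)]
```

For a uniform block, all energy lands in the DC coefficient at position (0, 0) and the AC coefficients are zero up to rounding. A network operating in the compressed domain would take grids of such coefficient blocks as input rather than decoded pixels.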
computer science, information systems, theory & methods,engineering, electrical & electronic, software engineering