TableSegNet: a fully convolutional network for table detection and segmentation in document images

Duc-Dung Nguyen
DOI: https://doi.org/10.1007/s10032-021-00390-4
2021-11-22
International Journal on Document Analysis and Recognition (IJDAR)
Abstract:Advances in image object detection lead to applying deep convolution neural networks in the document image analysis domain. Unlike general colorful and pattern-rich objects, tables in document images have properties that limit the capacity of deep learning structures. Significant variation in size and aspect ratio and the local similarity among document components are the main challenges that require both global features for detection and local features for the separation of nearby objects. To deal with these challenges, we present TableSegNet, a compact architecture of a fully convolutional network to detect and separate tables simultaneously. TableSegNet consists of a deep convolution path to detect table regions in low resolution and a shallower path to locate table locations in high resolution and split the detected regions into individual tables. To improve the detection and separation capacity, TableSegNet uses convolution blocks of wide kernel sizes in the feature extraction process and an additional table-border class in the main output. With only 8.1 million parameters and trained purely on document images from the beginning, TableSegNet has achieved state-of-the-art F1 score at the IoU threshold of 0.9 on the ICDAR2019 and the highest number of correctly detected tables on the ICDAR2013 table detection datasets.
computer science, artificial intelligence
What problem does this paper attempt to address?