Weakly Supervised Precise Segmentation for Historical Document Images.

Zecheng Xie, Yaoxiong Huang, Lianwen Jin, Yuliang Liu, Yuanzhi Zhu, Liangcai Gao, Xiaode Zhang
DOI: https://doi.org/10.1016/j.neucom.2019.04.001
IF: 6
2019-01-01
Neurocomputing
Abstract: With the passing of history, precious cultural heritage has been left behind to tell ancient stories, especially in the form of written documents. In this paper, a weakly supervised segmentation system that uses recognition-guided information on the attention area is proposed for high-precision historical document segmentation under strict intersection-over-union (IoU) requirements. We formulate the character segmentation problem from a Bayesian decision theory perspective and progressively propose boundary box segmentation (BBS), recognition-guided BBS (Rg-BBS), and recognition-guided attention BBS (Rg-ABBS) to search for the segmentation path. Furthermore, a novel judgment gate mechanism is proposed to train a high-performance character recognizer in an incremental, weakly supervised manner. By incorporating both character recognition knowledge and line-level annotation, the proposed Rg-ABBS method is shown to substantially reduce time consumption while maintaining sufficiently high segmentation precision. Experiments show that the proposed Rg-ABBS system significantly outperforms traditional segmentation methods as well as deep-learning-based instance segmentation and detection methods under strict IoU requirements.
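To make the "strict IoU requirements" in the abstract concrete, the following is a minimal sketch of how IoU between a predicted and a ground-truth character box is commonly computed. It is not taken from the paper: the function names `box_iou` and `is_match`, the (x1, y1, x2, y2) box layout, and the thresholds 0.5 and 0.8 are all illustrative assumptions.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    The box layout is an assumption made for this sketch.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, threshold):
    """Hypothetical helper: a prediction counts as correct if IoU >= threshold."""
    return box_iou(pred, gt) >= threshold


# A 10x10 ground-truth box and a prediction shifted right by 2 pixels.
gt = (0, 0, 10, 10)
pred = (2, 0, 12, 10)

iou = box_iou(pred, gt)              # intersection 80, union 120
loose = is_match(pred, gt, 0.5)      # illustrative loose threshold
strict = is_match(pred, gt, 0.8)     # illustrative strict threshold
```

In this example the shifted box has an IoU of about 0.67, so it would be accepted at the looser threshold but rejected at the stricter one, which is why small boundary errors matter when evaluation demands high IoU.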