Deep Networks for Degraded Document Image Binarization through Pyramid Reconstruction

Gaofeng Meng, Kun Yuan, Ying Wu, Shiming Xiang, Chunhong Pan
DOI: https://doi.org/10.1109/icdar.2017.124
2017-11-01
Abstract: Binarization of document images is an important processing step for document image analysis and recognition. However, the problem is challenging when image quality is degraded by varying illumination, complicated backgrounds, or noise from ink spots, water stains, and document creases. In this paper, we propose a framework based on a deep convolutional neural network (DCNN) for adaptive binarization of degraded document images. The basic idea of our method is to use the DCNN to decompose a degraded document image into a spatial pyramid structure, with each layer at a different scale. The foreground image is then sequentially reconstructed from these layers in a coarse-to-fine manner by a deconvolutional network. This decomposition is beneficial because multiresolution supervision information can be introduced directly into network learning. We also define several loss functions for label consistency and foreground smoothing to further regularize the training of the network. Experimental results demonstrate the effectiveness of the proposed method.
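The idea of multiresolution supervision can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the per-scale targets or losses are computed, so the 2x2 average pooling, the mean squared error, and the function names (`downsample2x`, `build_target_pyramid`, `multiscale_loss`) are all assumptions made for illustration. The sketch builds a pyramid of ground-truth foreground masks and sums a per-scale loss over the predictions a network might emit at each scale.

```python
def downsample2x(mask):
    """Average-pool a 2D list by a factor of 2 (assumes even height and width)."""
    h, w = len(mask), len(mask[0])
    return [
        [
            (mask[r][c] + mask[r][c + 1] + mask[r + 1][c] + mask[r + 1][c + 1]) / 4.0
            for c in range(0, w, 2)
        ]
        for r in range(0, h, 2)
    ]


def build_target_pyramid(mask, levels):
    """Return supervision targets ordered from finest to coarsest scale."""
    pyramid = [mask]
    for _ in range(levels - 1):
        pyramid.append(downsample2x(pyramid[-1]))
    return pyramid


def mse(pred, target):
    """Mean squared error between two 2D lists of the same shape."""
    n = len(target) * len(target[0])
    total = sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )
    return total / n


def multiscale_loss(preds, targets, weights=None):
    """Weighted sum of per-scale losses (MSE is an assumed stand-in here)."""
    if weights is None:
        weights = [1.0] * len(targets)
    return sum(w * mse(p, t) for w, p, t in zip(weights, preds, targets))


# Toy 4x4 foreground mask (1 = text pixel, 0 = background).
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
]
targets = build_target_pyramid(mask, levels=3)  # 4x4, 2x2, 1x1

# Hypothetical all-background predictions at every scale.
blank_preds = [[[0.0] * len(t[0]) for _ in t] for t in targets]
loss = multiscale_loss(blank_preds, targets)
```

In this toy setup every scale contributes its own error term, so a network is penalized for mistakes in the coarse layout as well as in fine stroke detail, which is the intuition behind supervising each pyramid layer directly.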