Abstract: Automatic document content processing is affected by artifacts caused by the shape of the paper and by non-uniform lighting of diverse color. Training fully supervised methods on real data is infeasible because of the large amount of data needed. Hence, the current state-of-the-art deep learning models are trained on fully or partially synthetic images. However, document shadow and shading removal results still suffer because: (a) prior methods rely on the uniformity of local color statistics, which limits their application to real scenarios with complex document shapes and textures; and (b) the synthetic or hybrid datasets used to train the models have non-realistic, simulated lighting conditions. In this paper we tackle these problems with two main contributions. First, we propose a physically constrained, learning-based method that directly estimates document reflectance based on intrinsic image formation and generalizes to challenging illumination conditions. Second, we introduce a new dataset that clearly improves on previous synthetic ones by adding a large range of realistic shading and diverse multi-illuminant conditions, uniquely customized to deal with documents in the wild. The proposed architecture works in a self-supervised manner in which only the synthetic texture is used as a weak training signal, obviating the need for very costly ground truth with disentangled versions of shading and reflectance. The proposed approach significantly improves the generalization of document reflectance estimation to real scenes with challenging illumination. We extensively evaluate on the real benchmark datasets available for intrinsic image decomposition and document shadow removal tasks. Our reflectance estimation scheme, when used as a pre-processing step of an OCR pipeline, yields a 26% improvement in character error rate (CER), demonstrating its practical applicability.
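The abstract refers to intrinsic image formation without stating the model. As a minimal sketch, assuming the standard multiplicative formulation in which the observed image is the per-pixel, per-channel product of reflectance and shading, the relationship might look as follows. The function names `compose` and `recover_reflectance` and the toy pixel values are hypothetical, and the division step only illustrates the formation model when shading is known; it is not the learned estimation method the paper proposes.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]   # (R, G, B) values in [0, 1]
Image = List[List[Pixel]]            # rows of pixels


def compose(reflectance: Image, shading: Image) -> Image:
    """Assumed formation model: observed = reflectance * shading, per channel."""
    return [
        [tuple(r * s for r, s in zip(r_px, s_px))
         for r_px, s_px in zip(r_row, s_row)]
        for r_row, s_row in zip(reflectance, shading)
    ]


def recover_reflectance(image: Image, shading: Image, eps: float = 1e-6) -> Image:
    """Invert the model when shading is known; eps guards against division by zero."""
    return [
        [tuple(i / max(s, eps) for i, s in zip(i_px, s_px))
         for i_px, s_px in zip(i_row, s_row)]
        for i_row, s_row in zip(image, shading)
    ]


# Toy 1x2 document patch: white paper next to dark ink.
reflectance = [[(0.9, 0.9, 0.9), (0.1, 0.1, 0.1)]]
# Colored, non-uniform shading: warm light on the left, bluish shadow on the right.
shading = [[(1.0, 0.8, 0.6), (0.3, 0.4, 0.5)]]

observed = compose(reflectance, shading)
estimate = recover_reflectance(observed, shading)
```

In this toy setting the shading is given, so the reflectance is recovered up to floating-point error. In real captures the shading is unknown, which is why it has to be estimated rather than divided out directly.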

Learning from Documents in the Wild to Improve Document Unwarping

DewarpNet: Single-Image Document Unwarping with Stacked 3D and 2D Regression Networks

UVDoc: Neural Grid-based Document Unwarping

DocUNet: Document Image Unwarping Via a Stacked U-Net

Adaptive dewarping of severely warped camera-captured document images based on document map generation

Layout-aware Single-image Document Flattening

MataDoc: Margin and Text Aware Document Dewarping for Arbitrary Boundary

Intrinsic Decomposition of Document Images In-the-Wild

Rethinking Supervision in Document Unwarping: A Self-consistent Flow-free Approach

Restoring Camera-Captured Distorted Document Images

Fourier Document Restoration for Robust Document Dewarping and Recognition

Arbitrary Warped Document Image Restoration Based on Segmentation and Thin-Plate Splines

DocTr: Document Image Transformer for Geometric Unwarping and Illumination Correction

A Survey on Deep learning based Document Image Enhancement

A Fusion Framework of Whitespace Smear Cutting and Swin Transformer for Document Layout Analysis

Table Image Dewarping with Key Element Segmentation

Remote Sensing Image Rectangling With Iterative Warping Kernel Self-Correction Transformer

Effective Document Image Rectification via a Deep Learning Framework

Learning Residual Elastic Warps for Image Stitching under Dirichlet Boundary Condition

Efficient Joint Rectification of Photometric and Geometric Distortions in Document Images

Deep Unrestricted Document Image Rectification