Learning from Documents in the Wild to Improve Document Unwarping

Ke Ma,Sagnik Das,Zhixin Shu,Dimitris Samaras
DOI: https://doi.org/10.1145/3528233.3530756
2022-01-01
Abstract:Document image unwarping is important for document digitization and analysis. The state-of-the-art approach relies on purely synthetic data to train deep networks for unwarping. As a result, the trained networks have generalization limitations when testing on real-world images, often yielding unsatisfying results. In this work, we propose to improve document unwarping performance by incorporating real-world images in training. We collected Document-in-the-Wild (DIW) dataset contains 5000 captured document images with large diversities in content, shape, and capturing environment. We annotate the boundaries of all DIW images and use them for weakly supervised learning. We propose a novel network architecture, PaperEdge, to train with a hybrid of synthetic and real document images. Additionally, we identify and analyze the flaws of popular evaluation metrics, e.g., MS-SSIM and Local Distortion (LD), for document unwarping and propose a more robust and reliable error metric called Aligned Distortion (AD). Training with a combination of synthetic and real-world document images, we demonstrate state-of-the-art performance on popular benchmarks with comprehensive quantitative evaluations and ablation studies. Code and data are available at https://github.com/cvlab-stonybrook/PaperEdge.
What problem does this paper attempt to address?