High-Resolution Document Shadow Removal via A Large-Scale Real-World Dataset and A Frequency-Aware Shadow Erasing Net

Zinuo Li, Xuhang Chen, Chi-Man Pun, Xiaodong Cun
2024-06-18
Abstract: Shadows often occur when we capture documents with casual equipment, which degrades the visual quality and readability of the digital copies. Unlike algorithms for natural shadow removal, algorithms for document shadow removal need to preserve the details of fonts and figures in high-resolution input. Previous works ignore this problem and remove the shadows via approximate attention and small datasets, which might not work in real-world situations. We handle high-resolution document shadow removal directly via a larger-scale real-world dataset and a carefully designed frequency-aware network. As for the dataset, we acquire over 7k pairs of high-resolution (2462 x 3699) real-world document images with various samples under different lighting circumstances, which is 10 times larger than existing datasets. As for the design of the network, we decouple the high-resolution images in the frequency domain, where the low-frequency details and high-frequency boundaries can be effectively learned via the carefully designed network structure. Powered by our network and dataset, the proposed method clearly shows better performance than previous methods in terms of visual quality and numerical results. The code, models, and dataset are available at: https://github.com/CXH-Research/DocShadow-SD7K
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper aims to address the issue of high-resolution document shadow removal. Specifically:

1. **Shortcomings of existing methods**:
   - Current natural shadow removal algorithms fail to preserve details such as fonts and images when processing documents.
   - Most existing document shadow removal methods are only suitable for relatively low-resolution images and struggle to handle high-resolution images directly.
   - There is a lack of large-scale, high-resolution dedicated datasets.
2. **Proposed method**:
   - The authors provide a dataset containing over 7,000 pairs of real-world high-resolution document images (SD7K), captured under different lighting conditions, with manually annotated shadow masks.
   - They designed a network structure called FSENet (Frequency-aware Shadow Erasing Network), which can effectively handle shadows in high-resolution images by processing the low-frequency and high-frequency parts separately.
3. **Main contributions**:
   - Providing a large-scale real-world high-resolution document shadow dataset SD7K.
   - Proposing a frequency decomposition-based network structure FSENet, specifically for handling high-resolution document shadows.
   - Conducting qualitative and quantitative analysis on all available public datasets, demonstrating that the proposed FSENet outperforms existing methods.

Through these efforts, the paper addresses the shortcomings of existing methods in high-resolution document shadow removal, improving the effectiveness and practicality of shadow removal.
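
To make the idea of processing low-frequency and high-frequency parts separately more concrete, the sketch below splits a grayscale image into a smooth low-frequency band and a residual high-frequency band, then recombines them without loss. It is only an illustration under stated assumptions: the box-blur low-pass filter, the single-level split, and the function names `box_blur`, `split_frequencies`, and `merge_frequencies` are hypothetical choices and are not taken from the paper, whose actual decomposition and network are not described in this summary.

```python
import numpy as np


def box_blur(img: np.ndarray, radius: int = 2) -> np.ndarray:
    """Separable box blur with edge padding (a simple low-pass filter).

    Assumption: `img` is a 2-D float array (grayscale image).
    """
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, radius, mode="edge")
    # Filter along rows, then along columns; "valid" mode trims the padding.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def split_frequencies(img: np.ndarray, radius: int = 2):
    """Return (low, high): a blurred band and the residual detail band."""
    low = box_blur(img, radius)
    high = img - low  # edges and fine strokes end up here
    return low, high


def merge_frequencies(low: np.ndarray, high: np.ndarray) -> np.ndarray:
    """Invert the split: the two bands sum back to the original image."""
    return low + high


if __name__ == "__main__":
    # Toy "document": dark left half, bright right half (a vertical edge).
    page = np.zeros((8, 8))
    page[:, 4:] = 1.0
    low, high = split_frequencies(page, radius=1)
    print(np.allclose(merge_frequencies(low, high), page))
```

In a pipeline of this general shape, soft illumination changes such as shadows mostly affect the low band, while text strokes and figure boundaries concentrate in the high band, which is one motivation for handling the two separately.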