Enhanced Screen Content Image Compression: A Synergistic Approach for Structural Fidelity and Text Integrity Preservation

Fangtao Zhou, Xiaofeng Huang, Peng Zhang, Meng Wang, Zhao Wang, Yang Zhou, Haibing Yin
DOI: https://doi.org/10.1145/3664647.3681354
2024-01-01
Abstract: With the rapid development of video conferencing and online education applications, screen content image (SCI) compression has become increasingly crucial. Recently, deep learning techniques have made significant progress in compressing natural images, surpassing traditional standards such as Versatile Video Coding (VVC). However, applying these methods directly to SCIs is challenging because of the unique characteristics of SCIs. In this paper, we propose a synergistic approach to preserve structural fidelity and text integrity for SCIs. First, we introduce external prior guidance, which provides global spatial attention to enhance structural fidelity and text integrity. Second, a structural enhancement module improves the preservation of structural information through an enhanced spatial feature transform. Finally, we optimize the loss function with a weighted mean square error to improve compression efficiency in text regions. Experimental results show that the proposed method achieves an average BD-Rate saving of 13.3% over the baseline window attention convolutional neural network (WACNN) across the JPEGAI, SIQAD, SCID, and MLSCID datasets. Our code is available at https://github.com/vpaHduGroup/SFTIP_SCC.
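The weighted mean square error mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function name `weighted_mse`, the binary `text_mask`, the `text_weight` value, and the normalization by total weight are hypothetical and may differ from what the authors' code does.

```python
def weighted_mse(original, reconstructed, text_mask, text_weight=2.0):
    """Mean squared error in which text pixels count `text_weight` times.

    All three inputs are 2D lists of equal shape. `text_mask` holds 1 for
    pixels assumed to belong to text and 0 elsewhere (hypothetical input;
    the paper may derive its weights differently).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for row_x, row_y, row_m in zip(original, reconstructed, text_mask):
        for x, y, is_text in zip(row_x, row_y, row_m):
            # Text pixels get the larger weight, all others weight 1.
            w = text_weight if is_text else 1.0
            weighted_sum += w * (x - y) ** 2
            weight_total += w
    # Normalizing by the total weight is an assumption of this sketch.
    return weighted_sum / weight_total
```

With an all-zero mask the function reduces to the ordinary mean square error, so raising `text_weight` only changes how strongly errors inside text regions are penalized relative to the rest of the image.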