High Efficient Compression Strategy for Scanned Receipts and Handwritten Documents

Danhua Xu, Xudong Bao
DOI: https://doi.org/10.1109/ICISE.2009.632
2009-01-01
Abstract: Image compression is one of the traditional topics in image processing and has been widely discussed and applied. Standards such as JPEG and JPEG 2000 have been published for applications dealing with grayscale or color photographs and medical images. However, some specific applications, such as electronic financial management systems (eFMS), require much more efficient algorithms for compressing receipts and handwritten documents. A new compression strategy is discussed that separates foreground from background, under the assumption that the foreground tolerates little degradation because it carries the most important information, while the background tolerates more degradation because it only conveys the realistic appearance of the document. The image is first transformed to the YCbCr color space to separate intensity (luminance) from color. Foreground and background are then extracted from the intensity subimage with a median filter. Both are down-sampled and clustered separately based on their gray-level histograms. The chrominance (color-difference) subimages are also down-sampled and converted to a palette-index model by clustering based on the 2D histogram. All clustered subimages are encoded with the run-length encoding (RLE) algorithm adopted from JPEG and finally synthesized. The results demonstrate much higher compression rates for the presented strategy than for the JPEG standard.
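The first stages of the pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the 3x3 window, the threshold of 40, and the assumption of dark ink on lighter paper are all hypothetical choices made for illustration. The color conversion uses the standard full-range JFIF coefficients, and the encoder is plain run-length encoding rather than the JPEG-specific variant the abstract refers to. Down-sampling and histogram clustering are omitted.

```python
from statistics import median


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (full-range JFIF coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component to the 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


def median_filter(gray, radius=1):
    """Median-filter a 2D list of intensities; borders clamp to the nearest pixel."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [
                gray[min(h - 1, max(0, i + di))][min(w - 1, max(0, j + dj))]
                for di in range(-radius, radius + 1)
                for dj in range(-radius, radius + 1)
            ]
            out[i][j] = median(window)
    return out


def split_foreground(gray, radius=1, threshold=40):
    """Estimate the background with a median filter and mark much darker pixels.

    Assumption (not from the paper): foreground strokes are darker than the
    local background by more than `threshold` intensity levels.
    Returns (mask, background), where mask[i][j] == 1 marks a foreground pixel.
    """
    background = median_filter(gray, radius)
    mask = [
        [1 if background[i][j] - gray[i][j] > threshold else 0
         for j in range(len(gray[0]))]
        for i in range(len(gray))
    ]
    return mask, background


def rle_encode(values):
    """Plain run-length encoding: a list of (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [tuple(run) for run in runs]


if __name__ == "__main__":
    # A 5x5 light page with a single dark "ink" pixel in the centre.
    page = [[200] * 5 for _ in range(5)]
    page[2][2] = 20
    mask, background = split_foreground(page)
    for row in mask:
        print(rle_encode(row))
```

In this toy case the median filter removes the isolated dark pixel from the background estimate, so the pixel appears only in the foreground mask. Each mask row then collapses to a few runs, which is why clustered, low-entropy subimages of this kind suit run-length coding.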