Page Segmentation of Chinese Newspapers

J Xi, JM Hu, LD Wu
DOI: https://doi.org/10.1016/s0031-3203(01)00248-5
IF: 8
2002-01-01
Pattern Recognition
Abstract: This paper describes a new bottom-up method for page segmentation of Chinese document images. Because of some special characteristics of Chinese newspaper documents, many traditional methods developed for English documents fail to segment them correctly. Based on the run-length smoothing algorithm and minimal spanning tree clustering, the proposed method resolves the segmentation problems that distinguish Chinese documents from English ones.
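For readers unfamiliar with the run-length smoothing algorithm named in the abstract, the following is a minimal sketch of the classical technique, not the authors' implementation. It assumes a binary image stored as nested lists with 1 for black and 0 for white, and it assumes that only white runs bounded by black pixels on both sides are filled. The function names and the threshold parameters are hypothetical and chosen for illustration only.

```python
def rlsa_row(row, threshold):
    """Fill white runs (0s) no longer than `threshold` that lie between two black pixels (1s)."""
    out = list(row)
    last_black = None  # index of the most recent black pixel seen so far
    for i, pixel in enumerate(row):
        if pixel == 1:
            if last_black is not None:
                gap = i - last_black - 1  # length of the white run just passed
                if 0 < gap <= threshold:
                    for j in range(last_black + 1, i):
                        out[j] = 1
            last_black = i
    return out


def rlsa(image, h_threshold, v_threshold):
    """Classical two-pass smoothing: smooth rows and columns separately, then AND the results."""
    # Horizontal pass: smooth each row of the original image.
    horizontal = [rlsa_row(r, h_threshold) for r in image]
    # Vertical pass: transpose, smooth each column as if it were a row, transpose back.
    columns = [list(c) for c in zip(*image)]
    vertical = [list(r) for r in zip(*[rlsa_row(c, v_threshold) for c in columns])]
    # A pixel is black in the output only if both passes marked it black.
    return [[h & v for h, v in zip(hr, vr)] for hr, vr in zip(horizontal, vertical)]
```

Calling `rlsa_row` on a single scan line merges nearby black pixels into one block while leaving wide gaps open, which is what lets a later clustering stage treat characters or lines as connected units. Suitable thresholds depend on character size and spacing, which is presumably where Chinese and English layouts differ.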