A Hierarchical Method for Clustering Binary Text Image

Yiguo Pu, Jinqiao Shi, Li Guo
DOI: https://doi.org/10.1007/978-3-642-35795-4_49
2013-01-01
Abstract: Image clustering is a crucial task in image retrieval, filtering, and organization. Most recent work deals with color or grayscale images, using features extracted from text content, annotations, or image content. This paper targets binary text images and proposes a novel clustering method that can be used for automatic image processing in digital libraries and office automation. The method has three main steps. First, images are preprocessed to remove noise, correct orientation, and produce coarse classes. Second, features are extracted and similar images are grouped into new classes with a hierarchical clustering algorithm. Finally, the new classes are merged into the nearest old ones, subject to a distance condition. To speed up clustering, a locality-sensitive hashing (LSH) algorithm is introduced to accelerate the merging procedure. Experiments show that this method is faster and more efficient than the basic clustering method. © Springer-Verlag Berlin Heidelberg 2013.
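The second and third steps of the abstract (hierarchical grouping, then merging new classes into the nearest old ones under a distance condition) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features, the linkage rule, or the distance measure, so the centroid linkage, Euclidean distance, threshold values, and the names `agglomerate`, `merge_into_old`, and the `new0`, `new1` labels are all assumptions made for illustration. The LSH acceleration is omitted; here every old class is compared directly.

```python
from math import dist  # Euclidean distance, Python 3.8+


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def agglomerate(points, threshold):
    """Bottom-up clustering (assumed centroid linkage): repeatedly merge the
    two clusters with the closest centroids until the closest pair is
    farther apart than `threshold`."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def merge_into_old(new_clusters, old_classes, threshold):
    """Attach each new cluster to the nearest old class if its centroid lies
    within `threshold`; otherwise keep it as a separate class with a
    hypothetical label such as 'new0'."""
    # Old centroids are computed once and held fixed during merging.
    old_centroids = {name: centroid(m) for name, m in old_classes.items()}
    result = {name: list(m) for name, m in old_classes.items()}
    fresh = 0
    for cluster in new_clusters:
        c = centroid(cluster)
        nearest = min(old_centroids,
                      key=lambda k: dist(old_centroids[k], c),
                      default=None)
        if nearest is not None and dist(old_centroids[nearest], c) <= threshold:
            result[nearest].extend(cluster)
        else:
            result[f"new{fresh}"] = list(cluster)
            fresh += 1
    return result


# Toy 2-D feature vectors standing in for features extracted from images.
features = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
old = {"A": [(0, 0.4)], "B": [(100, 100)]}
new_classes = agglomerate(features, threshold=2)
merged = merge_into_old(new_classes, old, threshold=3)
```

The pairwise search in `agglomerate` is quadratic per merge, and `merge_into_old` scans every old class. That linear scan is the kind of step the abstract says LSH is used to speed up, for example by comparing a new class only against old classes that fall into the same hash bucket.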