Auto-Indexing Based on Chinese Characters Coding on Words Platform

JIAO Hui, LIU Qian, JIA Hui-bo
DOI: https://doi.org/10.3321/j.issn:1002-8331.2007.15.053
2007-01-01
Computer Engineering and Applications Journal
Abstract: Auto-indexing is one of the key techniques of content-based information retrieval. At present, research on Chinese auto-indexing focuses mainly on automatic word segmentation, which is a preprocessing problem. This paper presents a Chinese character coding method on a word platform and establishes a new computer text format for Chinese in which the word is the smallest unit of information. With this method, auto-indexing no longer relies on segmentation, which should improve both its efficiency and its quality.
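The abstract does not give the coding scheme itself, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical table, WORD_CODES, that assigns one integer code to each word; a text is then stored as a sequence of word codes, and an inverted index can be built by reading those codes directly, with no segmentation step. All names and sample words here are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical word-level code table (an assumption for illustration):
# each word, not each character, is assigned one code.
WORD_CODES = {"信息": 1, "检索": 2, "自动": 3, "标引": 4, "技术": 5}
CODE_WORDS = {code: word for word, code in WORD_CODES.items()}


def encode(words):
    """Store a text as a list of word codes, so word boundaries live in the encoding."""
    return [WORD_CODES[w] for w in words]


def build_index(docs):
    """Build an inverted index from word-coded documents without segmenting any text."""
    index = defaultdict(set)
    for doc_id, codes in docs.items():
        for code in codes:
            # Each stored code already denotes a whole word.
            index[CODE_WORDS[code]].add(doc_id)
    return index


docs = {
    "d1": encode(["自动", "标引", "技术"]),
    "d2": encode(["信息", "检索", "技术"]),
}
index = build_index(docs)
print(sorted(index["技术"]))  # ['d1', 'd2']
```

In this sketch the indexing loop never inspects individual characters, which is the property the abstract attributes to a word-based text format.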