Exploration of textual document archives using a fuzzy hierarchical clustering algorithm in the GAMBAL system

Vicenç Torra, Sadaaki Miyamoto, Sergi Lanau
DOI: https://doi.org/10.1016/j.ipm.2004.01.001
2005-05-01
Abstract: The Internet, together with the large amount of textual information available in document archives, has increased the relevance of tools for information retrieval. In this work we present an extension of the Gambal system for clustering and visualization of documents based on fuzzy clustering techniques. The tool structures the set of documents hierarchically (using a fuzzy hierarchical structure) and represents this structure in a graphical interface (a 3D sphere) that the user can navigate. Gambal supports the analysis of documents and the computation of their similarity not only on the basis of the syntactic similarity between words but also on the basis of a dictionary (WordNet 1.7) and latent semantic analysis.
computer science, information systems; information science & library science