A methodology for semi-automatic classification schema building

Erika De Francesco, Salvatore Iiritano, Antonino Spagnolo, Marco Iannelli
DOI: https://doi.org/10.48550/arXiv.0910.0735
2009-10-05
Other Computer Science
Abstract: This paper describes a methodology for semi-automatic definition of a classification schema, that is, a taxonomy of categories used for automatic document classification. The methodology combines two approaches: (i) an extensional approach, which creates a typology starting from a document base, and (ii) an intensional approach, which builds the classification schema starting from that typology. The extensional approach uses clustering techniques to group documents on the basis of a similarity measure, whereas the intensional approach uses a set of operations (aggregation, reduction, generalization, specialization) to define classes.
Keywords: ontology, classification schema, fundamentum divisionis, cluster analysis classification task.
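The extensional step (grouping documents by a similarity measure) can be illustrated with a minimal sketch. The abstract does not say which representation, similarity measure, or clustering algorithm the authors use, so everything here is an assumption made for illustration: bag-of-words vectors, cosine similarity, single-link grouping, a threshold of 0.3, and the hypothetical function names `vectorize`, `cosine`, and `cluster`.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed representation, not from the paper)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Single-link grouping: documents linked by a chain of pairwise
    similarities >= threshold end up in the same cluster.
    The threshold value is an illustrative assumption."""
    vectors = [vectorize(d) for d in docs]
    parent = list(range(len(docs)))  # union-find forest over document indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(docs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy document base (invented for this example).
docs = [
    "invoice payment due amount",
    "payment invoice overdue amount",
    "football match final score",
    "score of the football match",
]
clusters = cluster(docs)
print(clusters)
```

Each resulting cluster is a list of document indices and plays the role of one candidate type in the typology. The intensional operations (aggregation, reduction, generalization, specialization) would then be applied to these clusters to define classes.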