Feature extraction based on principal component analysis for text categorization

Ismail El Moudden, A. Kobbane, Safae Lhazmir
DOI: https://doi.org/10.23919/PEMWN.2017.8308030
2017-11-01
Abstract: Over the past 20 years, the volume of data has grown substantially across many fields. The Internet of Things (IoT), for instance, comprises billions of devices, and the data streams coming from these devices challenge traditional approaches to data management and contribute to the emerging paradigm of big data. To handle such data adequately, it is necessary to reduce its dimensionality to a size more compatible with the methods used to process it, even if this reduction can lead to a slight loss of information. The aim of this paper is to study the potential of dimensionality reduction for text categorization on the publicly available CNAE-9 dataset.
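As an illustration of the kind of dimensionality reduction the title refers to, the following is a minimal sketch of principal component analysis applied to a small term-count matrix. It is not the authors' pipeline, which the abstract does not describe. The function names (`center`, `covariance`, `top_eigenpair`, `pca`), the toy matrix, and the choice of power iteration with deflation are all assumptions made for this example; it uses only the Python standard library so that it runs without extra dependencies.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpair(matrix, iters=500, seed=0):
    """Power iteration: dominant eigenvalue and unit eigenvector
    of a symmetric positive semi-definite matrix."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.random() + 0.1 for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero matrix: no variance left to explain
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for a unit vector v.
    eigval = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                 for i in range(d))
    return eigval, v


def pca(rows, k):
    """Project rows onto their first k principal components.

    Returns (projected rows, components, explained variances)."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components, variances = [], []
    for _ in range(k):
        lam, v = top_eigenpair(cov)
        components.append(v)
        variances.append(lam)
        # Deflation: remove the component just found from the covariance.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(r[j] * c[j] for j in range(d)) for c in components]
                 for r in centered]
    return projected, components, variances


if __name__ == "__main__":
    # Hypothetical term counts: 5 documents x 4 vocabulary terms.
    docs = [
        [3.0, 0.0, 1.0, 0.0],
        [2.0, 0.0, 2.0, 1.0],
        [0.0, 4.0, 0.0, 2.0],
        [0.0, 3.0, 1.0, 3.0],
        [1.0, 1.0, 1.0, 1.0],
    ]
    reduced, _, explained = pca(docs, k=2)
    for row in reduced:
        print([round(x, 3) for x in row])
    print("explained variance:", [round(x, 3) for x in explained])
```

Each document is reduced from four term counts to two coordinates, which could then be passed to any classifier. For a matrix with many more features, a library routine based on singular value decomposition would normally replace the hand-written power iteration shown here.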
Computer Science