Chinese text classification method using FastText and term frequency-inverse document frequency optimization

Tiantian Zhou, Yintong Wang, Xin Zheng
DOI: https://doi.org/10.1088/1742-6596/1693/1/012121
2020-12-01
Journal of Physics: Conference Series
Abstract: With the development of information technology, obtaining information quickly and accurately has become an indispensable part of people’s lives. Text classification can filter and organize massive amounts of text data to find valuable information, so its practical application is of great significance. This paper proposes a Chinese text classification method using FastText and term frequency-inverse document frequency optimization (I-FastText). The method introduces term frequency-inverse document frequency (TF-IDF) optimization into the input layer of the FastText model, removes words with high frequency and low discrimination ability, and obtains high-quality word vectors for training the text classification model. Experimental results on the THUCNews dataset show that the Chinese text classification accuracy of I-FastText is significantly better than that of the TF-IDF and FastText methods.
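The filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the exact scoring or cutoff, so the smoothed IDF formula, the function names `idf_scores` and `filter_low_idf`, the `min_idf` threshold, and the toy pre-segmented documents are all assumptions made for illustration. The idea shown is only that tokens appearing in nearly every document (low IDF, hence low discrimination ability) are dropped before the text reaches the classifier's input layer.

```python
import math
from collections import Counter


def idf_scores(docs):
    """Smoothed inverse document frequency for every token in the tokenized docs."""
    n_docs = len(docs)
    # Count in how many documents each token occurs (not how many times).
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    return {
        tok: math.log((1 + n_docs) / (1 + df)) + 1.0
        for tok, df in doc_freq.items()
    }


def filter_low_idf(docs, min_idf):
    """Drop tokens whose IDF is below min_idf (a hypothetical cutoff)."""
    idf = idf_scores(docs)
    return [[tok for tok in doc if idf[tok] >= min_idf] for doc in docs]


# Toy, already word-segmented documents (assumed data, not from THUCNews).
docs = [
    ["的", "股票", "市场", "上涨"],
    ["的", "球队", "比赛", "胜利"],
    ["的", "股票", "基金", "下跌"],
]

# "的" occurs in every document, so its IDF is the lowest and it is removed.
filtered = filter_low_idf(docs, min_idf=1.2)
print(filtered)
```

In a full pipeline, each filtered token list would be joined back into a line of text and paired with its class label before training the FastText classifier; how the paper combines the TF-IDF weights with the input layer beyond this removal step is not specified in the abstract.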