Research on News Text Clustering for International Chinese Education

Liangjie Yuan, Huizhou Zhao, Zhimin Wang
DOI: https://doi.org/10.1109/ialp61005.2023.10337054
2023-01-01
Abstract: Automatic clustering technology for news texts is increasingly mature, making it possible to monitor and analyze the development of international Chinese education. In this study, a large-scale corpus of news texts on international Chinese education was collected. Seven classification models, including logistic regression, decision trees, random forests, and naive Bayes, were used to classify the news texts automatically, reaching an accuracy of 85%. In addition, three principles for data collection and five categories for news classification were established. Based on the classification features of the news texts, the study describes the current state of international Chinese education in multiple countries. Analysis of news from different countries identified distinct characteristics in the content, format, and reporting style of Chinese education. These findings enable dynamic monitoring and analysis of development trends in international Chinese education around the world.
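To make the classification step concrete, the following is a minimal sketch of one of the model families named in the abstract, naive Bayes, written in plain Python. It is not the authors' implementation. The abstract does not specify the features, the tokenization, or the names of the five categories, so the whitespace tokenizer, the class name `NaiveBayesTextClassifier`, the labels `teaching` and `culture`, and the toy English headlines are all assumptions made for illustration. Real Chinese news text would first need word segmentation.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Hypothetical helper for illustration; not taken from the paper.
    """

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()        # assumed whitespace tokenizer
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Tokens never seen in training are ignored.
        tokens = [t for t in text.lower().split() if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up headlines and labels, for illustration only.
train_texts = [
    "university opens new chinese language course",
    "school adds mandarin course to curriculum",
    "students attend chinese culture festival",
    "calligraphy festival celebrates chinese culture",
]
train_labels = ["teaching", "teaching", "culture", "culture"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = model.predict("new mandarin course at local school")
print(prediction)
```

In practice, a comparison across several model families like the one the abstract describes would more likely rely on an established machine learning library and a held-out test set to estimate accuracy; the sketch above only shows the shape of the task, which is mapping a news text to one of a fixed set of categories.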