Big data-assisted urban governance: A comprehensive system for business documents classification of the government hotline
Zicheng Zhang, Anguo Li, Li Wang, Wei Cao, Jianlin Yang
DOI: https://doi.org/10.1016/j.engappai.2024.107997
IF: 8
2024-02-04
Engineering Applications of Artificial Intelligence
Abstract: Government service platforms, exemplified by the government hotline, must handle extensive volumes of business documents that contain rich and timely public opinion information and citizens' demands. However, manual processing cannot keep pace with large-scale text data, which raises operating costs and degrades the quality of government services. To address these challenges, this study proposes a comprehensive system for business document classification of the government hotline (BDCGHS) in China. BDCGHS fuses information entropy with term frequency-inverse document frequency (TF-IDF) weights to mine new words from the business documents of the government hotline and stores them in a new-word repository. These new words improve Chinese word segmentation and text representation for text classification. We introduce a novel data structure, the nested balanced binary tree, to expedite new-word mining; it achieves almost five times the computational speed of Trie trees. Comparative experiments on the THUNews and government hotline datasets show that the proposed BDCGHS algorithm outperforms the compared text classification algorithms by 3 %. Compared with the latest bidirectional encoder representations from transformers (BERT) model, BDCGHS improves the accuracy of order dispatch based on business documents by almost 3 %. The system has also operated stably in two Chinese cities for over a year, yielding favorable results.
automation & control systems, computer science, artificial intelligence, engineering, electrical & electronic, multidisciplinary
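
The abstract describes new-word mining that fuses information entropy with TF-IDF weights but does not give the scoring formula. The following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that "information entropy" means the entropy of the characters adjacent to a candidate string (left and right boundary entropy), that the fusion is a simple product of the smaller boundary entropy and a smoothed TF-IDF weight, and that candidates are character n-grams. The function names `shannon_entropy` and `mine_new_words` and the parameters `max_len` and `min_freq` are hypothetical. Plain dictionaries stand in for the nested balanced binary tree, whose structure the abstract does not specify.

```python
import math
from collections import Counter, defaultdict


def shannon_entropy(counter):
    """Shannon entropy (in nats) of a Counter of neighbouring characters."""
    total = sum(counter.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log(total / c) for c in counter.values())


def mine_new_words(docs, max_len=4, min_freq=2):
    """Rank character n-grams by boundary entropy times TF-IDF.

    Assumed scoring (not taken from the paper):
        score = min(left entropy, right entropy) * tf * idf
    """
    term_freq = Counter()         # occurrences of each n-gram in the corpus
    doc_freq = Counter()          # number of documents containing each n-gram
    left = defaultdict(Counter)   # characters seen immediately before an n-gram
    right = defaultdict(Counter)  # characters seen immediately after an n-gram

    for doc in docs:
        seen = set()
        for n in range(2, max_len + 1):
            for i in range(len(doc) - n + 1):
                gram = doc[i:i + n]
                term_freq[gram] += 1
                seen.add(gram)
                if i > 0:
                    left[gram][doc[i - 1]] += 1
                if i + n < len(doc):
                    right[gram][doc[i + n]] += 1
        doc_freq.update(seen)

    total_terms = sum(term_freq.values())
    scores = {}
    for gram, tf in term_freq.items():
        if tf < min_freq:
            continue
        # A string that always has the same neighbour is likely a fragment
        # of a longer word, so its boundary entropy (and score) is zero.
        boundary = min(shannon_entropy(left[gram]), shannon_entropy(right[gram]))
        # Smoothed inverse document frequency.
        idf = math.log((1 + len(docs)) / (1 + doc_freq[gram])) + 1.0
        scores[gram] = boundary * (tf / total_terms) * idf
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy corpus: "abc" recurs with varied neighbours on both sides.
    toy_docs = ["xabcy", "zabcw", "qabcr", "pabcs"]
    for gram, score in mine_new_words(toy_docs, max_len=3):
        print(gram, round(score, 4))
```

In the toy corpus, `abc` has varied neighbours on both sides and so receives a positive score, while its fragments `ab` and `bc` always have the same character on one side and score zero. In a pipeline like the one the abstract outlines, top-ranked candidates would presumably be added to a user dictionary for the word segmenter; that step is not shown here.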