Empirical Study on Character Level Neural Network Classifier for Chinese Text.

Tonglee Chung, Bin Xu, Yongbin Liu, Chunping Ouyang, Siliang Li, Lingyun Luo
DOI: https://doi.org/10.1016/j.engappai.2019.01.009
IF: 8
2019-01-01
Engineering Applications of Artificial Intelligence
Abstract: Character-level models have recently been drawing attention. A number of these models have been proposed and shown to be successful in Natural Language Processing tasks. Most of them have been evaluated mainly on English or other alphabetic languages, and a number of problems arise when they are applied to non-alphabetic languages such as Chinese. In this study, we investigate the problems encountered when transferring these models to Chinese and put forward some solutions. We propose a double embedding neural network model that is also character-level and consists of both a CNN and an RNN with two separate embeddings. The model is applied to a fundamental Natural Language Processing task, text classification. Experimental results on a Chinese corpus demonstrate that our character-level neural network model performs as well as or better than word-level classification models. Our model reaches 95.9% accuracy on a Chinese Fudan news dataset, outperforming the state-of-the-art models.
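To make the notion of character-level input concrete, the following is a minimal sketch of how Chinese text might be turned into index sequences without word segmentation. It is an illustration only: the abstract does not describe the authors' preprocessing, and the names `build_char_vocab` and `encode`, the `<pad>`/`<unk>` tokens, and the toy corpus are all assumptions introduced here. The double embedding CNN and RNN model itself is not reproduced.

```python
from collections import Counter

# Hypothetical special tokens; not taken from the paper.
PAD, UNK = "<pad>", "<unk>"


def build_char_vocab(texts, min_count=1):
    """Map each non-whitespace character to an integer index.

    Index 0 is reserved for padding and index 1 for unknown characters.
    Characters are ordered by descending frequency, ties broken by code point.
    """
    counts = Counter(ch for text in texts for ch in text if not ch.isspace())
    vocab = {PAD: 0, UNK: 1}
    for ch, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= min_count:
            vocab[ch] = len(vocab)
    return vocab


def encode(text, vocab, max_len):
    """Convert text to a fixed-length list of character indices."""
    ids = [vocab.get(ch, vocab[UNK]) for ch in text if not ch.isspace()]
    ids = ids[:max_len]                              # truncate long inputs
    return ids + [vocab[PAD]] * (max_len - len(ids))  # pad short inputs


# Toy corpus, invented for illustration.
corpus = ["我爱自然语言处理", "自然语言处理很有趣"]
vocab = build_char_vocab(corpus)
ids = encode("我爱语言学", vocab, max_len=8)
```

Each Chinese character becomes one token, so no segmenter is needed, but the vocabulary is far larger than an alphabet: a realistic corpus contains thousands of distinct characters, whereas English character-level models work with a few dozen symbols.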