A Chinese Short Text Classification Method for Tax Audit Reports based on Word Importance and Syntactic Enhancement BERT

Yaning Shi, Lukun Wang, Chunpeng Tian, Rujia Wang, Jiaming Pei, Amir Hussian, Ali Kashif Bashir
DOI: https://doi.org/10.1145/3594635
IF: 1.471
2023-04-25
ACM Transactions on Asian and Low-Resource Language Information Processing
Abstract: Tax audit is an important part of the tax collection and management system and directly affects the economic interests of both the country and taxpayers. Reducing enforcement risk in tax audit is therefore crucial to the continuous improvement of that system. Recently, research on using deep learning to classify Chinese tax audit data toward this goal has attracted much attention. Inspired by BERT, this paper proposes a syntactic enhancement BERT (SE-BERT). SE-BERT improves BERT's text understanding by learning the input features and grammatical structure of a text from its content and location embeddings. In addition, we combine word importance calculated by TF-IDF with SE-BERT as a weighting, which improves the recognition of locally salient features. In comparative experiments on our Chinese tax audit dataset, the proposed method achieves better performance than the compared methods.
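The TF-IDF weighting idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the weights are combined with SE-BERT, so everything below is an assumption: the function names `tfidf_weights` and `weighted_pool` are hypothetical, the smoothed IDF formula is one common variant, the toy token vectors merely stand in for encoder outputs, and the texts are assumed to be already word-segmented.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {token: tf-idf} dict per document.

    docs is a list of token lists (texts assumed already word-segmented).
    Uses a smoothed IDF, ln((1 + N) / (1 + df)) + 1, as an assumed variant.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each token once per document

    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            tok: (cnt / total) * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, cnt in counts.items()
        })
    return weights


def weighted_pool(token_vectors, tokens, weights):
    """Average token vectors, weighting each by its normalized tf-idf score.

    token_vectors are placeholders for encoder outputs (hypothetical here).
    """
    raw = [weights[tok] for tok in tokens]
    norm = sum(raw)
    dim = len(token_vectors[0])
    pooled = [0.0] * dim
    for vec, w in zip(token_vectors, raw):
        for i in range(dim):
            pooled[i] += (w / norm) * vec[i]
    return pooled


# Toy corpus of segmented short texts (illustrative only, not from the paper).
docs = [
    ["税务", "稽查", "报告"],
    ["税务", "申报"],
    ["稽查", "风险", "风险"],
]
weights = tfidf_weights(docs)

# Placeholder 2-dimensional vectors for the three tokens of the first text.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = weighted_pool(vectors, docs[0], weights[0])
```

In this sketch a token that appears in fewer texts (such as "报告") receives a larger weight than one that appears in several (such as "税务"), so it contributes more to the pooled representation. This is the sense in which TF-IDF weighting can emphasize locally salient words.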
computer science, artificial intelligence