A Content-Based Chinese Spam Detection Method Using a Capsule Network With Long-Short Attention

Xin Tong, Jingya Wang, Changlin Zhang, Runzheng Wang, Zhilin Ge, Wenmao Liu, Zhiyan Zhao
DOI: https://doi.org/10.1109/jsen.2021.3092728
IF: 4.3
2021-11-15
IEEE Sensors Journal
Abstract: Most existing Chinese spam detection models suffer from inaccurate text representation, unsatisfactory detection performance, and poor practicality. To address these problems, this paper proposes a capsule network model combined with a long-short attention mechanism for efficient Chinese spam detection. For text representation, the model uses a multi-channel structure based on the long-short attention mechanism, which captures complex text features in spam and generates contextual word vectors that carry more semantic information. For feature mining and classification, the model improves the structure of the traditional capsule network without compromising classification performance and optimizes the dynamic routing algorithm, so that it achieves high accuracy without reducing running speed. Experimental results show that the model outperforms current mainstream methods such as TextCNN, LSTM, and even BERT in both representation and detection, achieving an accuracy of 98.72% on an unbalanced dataset and 99.30% on a balanced dataset.
engineering, electrical & electronic; instruments & instrumentation; physics, applied