Clickbait detection on WeChat: A deep model integrating semantic and syntactic information
Tong Liu, Ke Yu, Lu Wang, Xuanyu Zhang, Hao Zhou, Xiaofei Wu
DOI: https://doi.org/10.1016/j.knosys.2022.108605
2022-06-01
Abstract: Online social media contains a large amount of clickbait, which uses tricks such as curiosity-inducing words and carefully designed sentence structures to attract users to click on hyperlinks for unknown benefits. Clickbait detection aims to identify these hyperlinks with automated algorithms. Previous research usually focuses on the semantic information of English clickbait corpora. In this paper, we construct a Chinese WeChat clickbait dataset and propose an effective deep method, multiple features for WeChat clickbait detection (MFWCD), which integrates semantic, syntactic, and auxiliary information. Based on the MFWCD framework, we propose two models with different parameter scales, MFWCD-BERT and MFWCD-BiLSTM, which respectively use Bidirectional Encoder Representations from Transformers (BERT) and a lightweight Bidirectional Long Short-Term Memory (Bi-LSTM) network with an attention mechanism to encode title semantics. In addition, we propose an improved Graph Attention Network (GAT) to aggregate the local syntactic structures of titles and use an attention mechanism to capture valuable structures. Finally, an auxiliary feature related to user reading behavior is introduced to obtain a richer title representation. Extensive experiments demonstrate the effectiveness and interpretability of MFWCD for clickbait detection, and its performance exceeds that of the compared baseline methods.
computer science, artificial intelligence
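
Illustrative sketch (not the authors' implementation): the abstract describes three ingredients, namely attention over title tokens, attention-weighted aggregation over local syntactic structure, and an auxiliary feature appended to the title representation. The toy code below shows how such pieces could fit together in plain Python. Every function name, the dot-product scoring (standing in for GAT's learned scoring function), the self-loops, the toy vectors, and concatenation as the fusion step are assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def graph_attention_layer(features, edges):
    """One simplified attention pass over a dependency graph.

    features: one vector per token; edges: (head, dependent) index pairs.
    Assumption: scores are plain dot products with no learned weights,
    unlike a real GAT layer.
    """
    neighbors = {i: [i] for i in range(len(features))}  # assumed self-loops
    for head, dep in edges:
        neighbors[head].append(dep)
        neighbors[dep].append(head)
    out = []
    for i, feat in enumerate(features):
        nbrs = neighbors[i]
        weights = softmax([dot(feat, features[j]) for j in nbrs])
        out.append([
            sum(w * features[j][k] for w, j in zip(weights, nbrs))
            for k in range(len(feat))
        ])
    return out


def attention_pool(vectors, query):
    """Collapse token vectors into one vector using attention weights."""
    weights = softmax([dot(v, query) for v in vectors])
    return [
        sum(w * v[k] for w, v in zip(weights, vectors))
        for k in range(len(vectors[0]))
    ]


def fuse(semantic_vec, syntactic_vec, auxiliary):
    """Assumed fusion step: concatenate both vectors and one scalar feature."""
    return semantic_vec + syntactic_vec + [auxiliary]


# Toy three-token title; token 2 is the syntactic head of tokens 0 and 1.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
syntactic = graph_attention_layer(tokens, [(2, 0), (2, 1)])
query = [1.0, 1.0]  # hypothetical query vector; it would be learned in practice
title_repr = fuse(attention_pool(tokens, query),
                  attention_pool(syntactic, query),
                  0.5)  # hypothetical reading-behavior feature value
```

In a trained model, the token vectors would come from BERT or a Bi-LSTM and the scoring would use learned parameters; the sketch only conveys the flow of information from token features to a fused title representation that a classifier could consume.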