Towards Cross-lingual Social Event Detection with Hybrid Knowledge Distillation

Jiaqian Ren, Hao Peng, Lei Jiang, Zhifeng Hao, Jia Wu, Shengxiang Gao, Zhengtao Yu, Qiang Yang
DOI: https://doi.org/10.1145/3689948
IF: 4.157
2024-01-01
ACM Transactions on Knowledge Discovery from Data
Abstract: Recently published graph neural networks (GNNs) show promising performance on social event detection tasks. However, most studies are oriented toward monolingual data in languages with abundant training samples, leaving lesser-spoken languages relatively unexplored. In this work, we present a GNN-based framework that integrates cross-lingual word embeddings into the process of graph knowledge distillation for detecting events in low-resource language data streams. The framework, a novel cross-lingual knowledge distillation approach called CLKD, exploits prior knowledge learned from similar threads in English to make up for the paucity of annotated data. Specifically, to extract sufficient useful knowledge, we propose a hybrid distillation method that transfers both feature-wise and relation-wise information. To transfer both kinds of knowledge effectively, we add a cross-lingual module to the feature-wise distillation to eliminate the language gap, and we selectively choose beneficial relations in the relation-wise distillation to avoid distraction caused by teachers’ misjudgments. CLKD also adopts different configurations to suit both offline and online situations. Experiments on real-world datasets show that the framework is highly effective at detection in languages where training samples are scarce.
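The hybrid distillation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the mean-squared-error terms, the cosine-similarity relations, the `threshold` rule for deciding which teacher relations are trustworthy, and the weights `alpha` and `beta` are all assumptions made for illustration. The cross-lingual module is not modeled here; the teacher features are assumed to be already mapped into the student's embedding space.

```python
import numpy as np


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def feature_loss(student_feats, teacher_feats):
    """Feature-wise term (assumed form): MSE between aligned features."""
    return float(np.mean((student_feats - teacher_feats) ** 2))


def relation_loss(student_feats, teacher_feats, labels, threshold=0.5):
    """Relation-wise term (assumed form) over trusted teacher relations.

    A message pair is kept only when the teacher's similarity judgment
    (similarity >= threshold) agrees with the ground-truth event labels,
    so pairs the teacher misjudges do not contribute to the loss.
    """
    s_sim = cosine_similarity_matrix(student_feats)
    t_sim = cosine_similarity_matrix(teacher_feats)
    same_event = labels[:, None] == labels[None, :]
    teacher_says_same = t_sim >= threshold
    mask = teacher_says_same == same_event
    np.fill_diagonal(mask, False)  # ignore trivial self-pairs
    if not mask.any():
        return 0.0
    return float(np.mean((s_sim[mask] - t_sim[mask]) ** 2))


def hybrid_loss(student_feats, teacher_feats, labels, alpha=1.0, beta=1.0):
    """Weighted sum of the feature-wise and relation-wise terms."""
    return (alpha * feature_loss(student_feats, teacher_feats)
            + beta * relation_loss(student_feats, teacher_feats, labels))


# Toy data: three messages, two events (hypothetical values).
teacher = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
student = np.array([[0.5, 0.5], [0.8, 0.2], [0.1, 0.9]])
labels = np.array([0, 0, 1])
print(hybrid_loss(student, teacher, labels))
```

In this sketch the loss is zero when the student reproduces the teacher's features exactly and grows as the student's features or pairwise relations drift from the teacher's trusted ones.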