DAFS: a domain aware few shot generative model for event detection

Nan Xia, Hang Yu, Yin Wang, Junyu Xuan, Xiangfeng Luo
DOI: https://doi.org/10.1007/s10994-022-06198-5
IF: 5.414
2022-09-05
Machine Learning
Abstract: Large-scale pre-trained models increasingly show clear advantages in event detection (ED), the task of classifying events by identifying their trigger words. However, such models depend heavily on labeled training data. Unfortunately, labeled data is scarce in some domains, such as finance, because annotation is costly. In addition, manually labeled training data often suffers from uneven sampling distributions, poor diversity, and a large amount of long-tail data. Recently, some researchers have used generative models to label data. However, training generative models requires rich domain knowledge, which cannot be obtained from few-shot resources. Therefore, we propose a Domain-Aware Few-Shot (DAFS) generative model that can generate domain-based training data from a relatively small amount of labeled data. First, DAFS uses self-supervised information from sentences of various categories to calculate word transition probabilities under different domains and to retain the key triggers in each sentence. Then, we apply our joint algorithm to generate labeled training data that accounts for both diversity and effectiveness. Experimental results demonstrate that the training data generated by DAFS significantly improves ED performance on real financial data. In particular, even with no more than 20 training examples, DAFS can still maintain generative quality to a certain extent. It also obtains new state-of-the-art results on the ACE2005 multilingual corpora.
computer science, artificial intelligence