Training with One2MultiSeq: CopyBART for social media keyphrase generation
Bengong Yu, Chunyang Gao, Shuwen Zhang
DOI: https://doi.org/10.1007/s11227-024-06050-8
IF: 3.3
2024-04-05
The Journal of Supercomputing
Abstract: Keyphrase generation helps readers extract key information from long documents, such as social media posts or scientific articles, in a short time. It has made significant progress in recent years, especially with training that concatenates keyphrases in a predefined order. However, when beam search is used for decoding, models tend to repeatedly generate the highest-priority keyphrase type in every beam branch, which weakens generation performance on the lower-priority ("underdog") keyphrase type. To tackle this, we introduce the One2MultiSeq paradigm, which trains the model on two sets of keyphrases concatenated in completely opposite orders. Moreover, social media content is often colloquial, informal, and multimodal (comprising not just text but also images), so models need prior knowledge to process it effectively. Contemporary models lack this capacity, which limits their ability to handle such data. To overcome this, we adopt the pretrained model BART as our backbone architecture and add a copy mechanism to further strengthen its keyphrase generation capabilities. Experimental results show that our method outperforms relatively advanced models, with gains of 3.51, 1.55, and 2.47 percentage points in F1@1, F1@3, and MAP@5, respectively, on the unimodal Twitter dataset; 3.23, 2.68, and 4.07 on the multimodal Tweet dataset; and 4.32, 0.32, and 7.07 in F1@3, F1@5, and MAP@5, respectively, on the StackExchange dataset.
computer science, theory & methods; engineering, electrical & electronic; hardware & architecture
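
The abstract describes One2MultiSeq as training on two sets of keyphrases concatenated in opposite orders. A minimal sketch of how such training pairs might be built is shown here. The separator token `<sep>`, the function name `one2multiseq_targets`, and the example post are all assumptions made for illustration; the abstract does not specify the paper's actual separator, ordering rule, or preprocessing.

```python
from typing import List, Tuple

# Hypothetical separator; the abstract does not name the token used in the paper.
SEP = " <sep> "


def one2multiseq_targets(source: str, keyphrases: List[str]) -> List[Tuple[str, str]]:
    """Build two (source, target) training pairs for one post.

    Both targets contain the same keyphrases; the second concatenates them
    in the reverse of the first order. This is an illustrative reading of
    "completely opposite connection orders", not the authors' code.
    """
    forward = SEP.join(keyphrases)             # predefined order
    backward = SEP.join(reversed(keyphrases))  # opposite order
    return [(source, forward), (source, backward)]


# Made-up example post and keyphrases, for illustration only.
pairs = one2multiseq_targets(
    "new phone camera is amazing",
    ["phone", "camera", "photography"],
)
for src, tgt in pairs:
    print(src, "->", tgt)
```

Under this reading, each post contributes two target sequences, so a keyphrase type that always comes last in one order comes first in the other.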