DMSeqNet-mBART: A State-of-the-Art Adaptive-DropMessage Enhanced mBART Architecture for Superior Chinese Short News Text Summarization
Kangjie Cao, Weijun Cheng, Yiya Hao, Yichao Gan, Ruihuan Gao, Junxu Zhu, Jinyao Wu
DOI: https://doi.org/10.1016/j.eswa.2024.125095
IF: 8.5
2024-01-01
Expert Systems with Applications
Abstract: Mandarin Chinese, a globally prevalent language, has an abundance of regularly refreshed short news texts available online. Producing concise summaries of these texts has therefore become a pivotal challenge for improving the efficiency of information dissemination and comprehension. To tackle this issue, we introduce DMSeqNet-mBART, a model built on the mBART framework and positioned as a state-of-the-art solution for Chinese short news summarization. DMSeqNet-mBART incorporates the Adaptive-DropMessage technique, a method that discards or retains information depending on the output of the attention mechanism. The model also integrates several enhanced components, namely dynamic convolutional layers, gated residual connections, customized feed-forward networks enhanced with batch normalization, self-attention, and cross-attention, all aimed at improving the performance and robustness of Chinese short news summarization. Rigorous comparative experiments on six recognized Chinese short news summary datasets demonstrate that DMSeqNet-mBART significantly surpasses industry-leading models such as T5, MLC, PLCC, and GPT-4 in fluency, completeness, robustness, and accuracy. These results, validated with the BERTScore, BLEU, and ROUGE metrics, underscore the model's superiority across diverse evaluation standards.
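The abstract states only that Adaptive-DropMessage discards or retains information depending on the attention output; it does not give the exact rule. The sketch below illustrates one possible reading, assuming that each token's features are dropped element-wise with a probability that decreases linearly with the token's attention weight. The function name `adaptive_drop_message`, the `max_drop` parameter, and the linear rule are all hypothetical and are not taken from the paper.

```python
import random


def adaptive_drop_message(features, attention, max_drop=0.5, rng=None, training=True):
    """Illustrative sketch only, not the paper's implementation.

    features:  list of per-token feature vectors (lists of floats)
    attention: one non-negative attention weight per token
    max_drop:  assumed drop probability for a token with zero attention
    """
    if len(features) != len(attention):
        raise ValueError("features and attention must have the same length")
    if not 0.0 <= max_drop < 1.0:
        raise ValueError("max_drop must be in [0, 1)")
    if not features or not training:
        # At inference time nothing is dropped, as with ordinary dropout.
        return [list(vec) for vec in features]
    if rng is None:
        rng = random.Random()

    top = max(attention)
    out = []
    for vec, weight in zip(features, attention):
        # Assumed rule: the most-attended token is never dropped; a token
        # with zero attention has its elements dropped at rate max_drop.
        keep_score = weight / top if top > 0 else 0.0
        p_drop = max_drop * (1.0 - keep_score)
        # Inverted-dropout rescaling keeps each element's expectation unchanged.
        scale = 1.0 / (1.0 - p_drop)
        out.append([0.0 if rng.random() < p_drop else x * scale for x in vec])
    return out


if __name__ == "__main__":
    feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    attn = [0.7, 0.2, 0.1]
    print(adaptive_drop_message(feats, attn, rng=random.Random(0)))
```

Under this assumption, highly attended tokens pass through almost untouched while weakly attended tokens are regularized more aggressively; a learned or non-linear mapping from attention to drop rate would fit the abstract's description equally well.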