Unsupervised KeyPhrase Extraction Based on Multi-granular Semantics Feature Fusion

Jie Chen, Hainan Hu, Shu Zhao, Yanping Zhang
DOI: https://doi.org/10.1007/978-3-031-50959-9_21
2023-01-01
Abstract: In Unsupervised Keyphrase Extraction (UKE) tasks, candidate phrases are typically ranked by their similarity to the document embedding. However, this approach assumes that every document focuses on only one topic, which makes it difficult to distinguish the significance of candidate keyphrases that belong to different topics. A method for acquiring diversified topic information is therefore needed to obtain accurate keyphrases. In this paper, we propose a new unsupervised keyphrase extraction method (MSFFUKE) that uses multi-granularity semantic feature fusion. We first group phrases into clusters through granulation, calculate the semantic similarity between each phrase and each cluster, and take the mean to obtain the semantic features of topic granularity. Then, we obtain the semantic features of phrase granularity from the degree centrality of candidate phrases in a graph structure. Finally, we integrate the semantic features of both granularities to rank candidate phrases. We evaluate our model on three public benchmarks (Inspec, DUC 2001, SemEval 2010) and compare it with current state-of-the-art models. The results demonstrate that our model performs better than most models and generalizes well to input documents from various domains and of different lengths. An ablation study further indicates that both topic-granularity and phrase-granularity semantic features are crucial for unsupervised keyphrase extraction.
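The three steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the embedding model, the clustering algorithm, the graph construction, or the fusion rule. The sketch assumes that phrase embeddings and cluster labels are already given, that clusters are represented by their centroids, that similarity is cosine similarity, that graph edges connect phrases whose similarity exceeds a hypothetical `threshold`, and that fusion is a weighted sum controlled by a hypothetical `alpha`. All function names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def topic_scores(emb, labels):
    """Topic granularity: mean similarity of each phrase to every cluster.

    Assumption: a cluster is represented by the centroid of its members.
    """
    clusters = {}
    for vec, lab in zip(emb, labels):
        clusters.setdefault(lab, []).append(vec)
    centroids = [
        [sum(col) / len(vecs) for col in zip(*vecs)]
        for vecs in clusters.values()
    ]
    return [sum(cosine(vec, c) for c in centroids) / len(centroids) for vec in emb]


def degree_centrality(emb, threshold=0.5):
    """Phrase granularity: normalized degree in a similarity graph.

    Assumption: two phrases are linked when their cosine similarity
    exceeds `threshold`.
    """
    n = len(emb)
    degrees = [
        sum(1 for j in range(n) if j != i and cosine(emb[i], emb[j]) > threshold)
        for i in range(n)
    ]
    return [d / max(n - 1, 1) for d in degrees]


def rank_phrases(phrases, emb, labels, alpha=0.5, threshold=0.5):
    """Fuse both features (assumed weighted sum) and sort phrases, best first."""
    topic = topic_scores(emb, labels)
    local = degree_centrality(emb, threshold)
    fused = [alpha * t + (1 - alpha) * c for t, c in zip(topic, local)]
    order = sorted(range(len(phrases)), key=lambda i: fused[i], reverse=True)
    return [phrases[i] for i in order]


# Toy usage with made-up 2-D embeddings and cluster labels.
toy_phrases = ["keyphrase extraction", "graph structure", "semantic feature"]
toy_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_labels = [0, 1, 0]
ranking = rank_phrases(toy_phrases, toy_emb, toy_labels)
```

In a real pipeline the toy vectors would be replaced by embeddings from a pretrained language model and the labels by the output of a clustering step. The weighted sum is only one possible way to combine the two scores.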