Improved Automatic Keyword Extraction Given More Semantic Knowledge

Kai Yang, Zhenhong Chen, Yi Cai, Dongping Huang, Ho-Fung Leung
DOI: https://doi.org/10.1007/978-3-319-32055-7_10
2016-01-01
Abstract: Graph-based ranking algorithms such as TextRank have proven remarkably effective for keyword extraction. However, these algorithms build graphs by considering only the lexical sequence of the documents. Hence, the graphs generated by these algorithms cannot reflect the semantic relationships between documents. In this paper, we demonstrate that information is lost in the graph-building process that converts textual documents into graphs. This loss leads the algorithm to misjudge keywords. To solve this problem, we propose a new approach called Topic-based TextRank. Unlike the traditional algorithm, our approach takes the lexical meaning of text units (i.e., words and phrases) into account. Our experimental results show that the proposed algorithm can outperform state-of-the-art algorithms.
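To make the baseline concrete, the following is a minimal sketch of a standard TextRank-style keyword ranker, not the Topic-based TextRank proposed in the paper, whose details the abstract does not give. It assumes pre-tokenized input, an undirected co-occurrence graph over a sliding window, and a fixed number of PageRank-style iterations; the function names, the window size, and the damping factor are illustrative choices. Note that edges come solely from word adjacency, which is the reliance on lexical sequence that the abstract criticizes.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Link tokens that occur within `window` positions of each other.

    The graph is undirected and built purely from word order, so no
    semantic information about the tokens is used (assumed baseline).
    """
    graph = defaultdict(set)
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window, len(tokens))):
            other = tokens[j]
            if other != word:
                graph[word].add(other)
                graph[other].add(word)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Score nodes with an unnormalized PageRank-style update."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            # Each neighbor shares its score equally among its own neighbors.
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) + damping * rank
        scores = new_scores
    return scores


if __name__ == "__main__":
    tokens = ["keyword", "extraction", "graph", "ranking", "keyword", "graph"]
    scores = textrank(build_cooccurrence_graph(tokens))
    for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(word, round(score, 3))
```

In this sketch, words that co-occur with many distinct neighbors rank highest regardless of what they mean, so two semantically related words that never appear near each other share no edge.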