Learning to extract coherent keyphrases from online news

Zhuoye Ding, Qi Zhang, Xuanjing Huang
DOI: https://doi.org/10.1007/978-3-642-25631-8_43
2011-01-01
Abstract: Keyphrases extracted from news articles can concisely represent the main content of news events. In this paper, we first present several criteria for high-quality news keyphrases. To integrate those criteria into the keyphrase extraction task, we then propose a novel formulation that converts the task into a learning-to-rank problem. Our approach involves two phases: selecting candidate keyphrases, and ranking all possible sub-permutations of the candidates. Three feature sets (lexical, locality, and coherence) are introduced to rank the sub-permutations, and the best sub-permutation provides the keyphrases. The proposed method is evaluated on a multi-news collection, and the experimental results verify that it is effective at extracting coherent news keyphrases.
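The two-phase idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the linear scoring function, the fixed weights, the pairwise treatment of coherence, and all names (`sub_permutations`, `score`, `best_sub_permutation`) and feature values below are assumptions made purely for illustration. In a learning-to-rank setting the weights would be learned from training data rather than set by hand.

```python
from itertools import permutations


def sub_permutations(candidates, max_len):
    """Yield every ordered selection of 1..max_len distinct candidates."""
    for k in range(1, max_len + 1):
        yield from permutations(candidates, k)


def score(seq, lexical, locality, coherence, weights=(1.0, 1.0, 1.0)):
    """Hypothetical linear score over three feature groups.

    lexical and locality map a phrase to a number; coherence maps an
    unordered pair of phrases to a number (assumed form, not from the paper).
    """
    w_lex, w_loc, w_coh = weights
    lex = sum(lexical[p] for p in seq)
    loc = sum(locality[p] for p in seq)
    # Coherence is summed over adjacent phrases in the sequence.
    coh = sum(coherence.get(frozenset(pair), 0.0) for pair in zip(seq, seq[1:]))
    return w_lex * lex + w_loc * loc + w_coh * coh


def best_sub_permutation(candidates, max_len, lexical, locality, coherence):
    """Return the highest-scoring sub-permutation (first one wins on ties)."""
    return max(
        sub_permutations(candidates, max_len),
        key=lambda seq: score(seq, lexical, locality, coherence),
    )


# Toy, made-up feature values for three candidate phrases.
candidates = ["earthquake", "rescue effort", "stock market"]
lexical = {"earthquake": 0.9, "rescue effort": 0.7, "stock market": 0.2}
locality = {"earthquake": 0.8, "rescue effort": 0.5, "stock market": 0.1}
coherence = {
    frozenset({"earthquake", "rescue effort"}): 1.0,
    frozenset({"earthquake", "stock market"}): -1.0,
    frozenset({"rescue effort", "stock market"}): -1.0,
}

best = best_sub_permutation(candidates, 2, lexical, locality, coherence)
```

Exhaustive enumeration grows factorially with the number of candidates, so a sketch like this is only practical when the first phase has already narrowed the candidate set to a handful of phrases.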