Semantically Improved Automatic Keyphrase Extraction

方俊,郭雷,王晓东
DOI: https://doi.org/10.3969/j.issn.1002-137x.2008.06.039
2008-01-01
Computer Science
Abstract: Keyphrases provide semantic metadata that summarizes the content of a document, and they are used in many text-mining applications. In keyphrase generation, we observe that the difference between the lexical level (the term for a meaning) and the conceptual level (the meaning itself) can cause inaccuracy. To address this problem, this paper proposes a new method that improves automatic keyphrase extraction by using semantic information about candidate keyphrases. In contrast to current methods, our method applies word sense disambiguation and outputs a set of senses instead of a set of terms, since each sense has exactly one meaning. Semantic relatedness between the senses of candidate keyphrases is taken into account in the term conflation, feature calculation, and evaluation stages. We evaluate our semantically improved method against the well-known Kea system using a more effective, semantically enhanced evaluation method. The inter-domain experiment shows that the quality of keyphrase extraction improves significantly when semantic information is exploited. The intra-domain experiment shows that our method is competitive with the Kea++ algorithm and is not domain-specific.
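The idea of counting senses rather than terms can be illustrated with a minimal sketch. This is not the authors' implementation: the sense inventory `SENSES`, the sense identifiers, and the gloss-overlap disambiguation rule (a simplified Lesk-style heuristic) are all assumptions made for illustration, since the abstract does not specify which disambiguation method or lexical resource is used.

```python
from collections import Counter

# Hypothetical toy sense inventory (an assumption, not from the paper):
# each term maps to a list of (sense_id, gloss_words) pairs.
SENSES = {
    "bank": [
        ("bank.finance", {"money", "loan", "deposit"}),
        ("bank.river", {"river", "water", "shore"}),
    ],
    "lender": [("bank.finance", {"money", "loan", "credit"})],
    "shore": [("bank.river", {"river", "water", "sea"})],
}


def disambiguate(term, context):
    """Return the sense whose gloss shares the most words with the context.

    A simplified Lesk-style rule; ties go to the first listed sense.
    Returns None for terms missing from the toy inventory.
    """
    candidates = SENSES.get(term, [])
    if not candidates:
        return None
    return max(candidates, key=lambda sense: len(sense[1] & context))[0]


def sense_counts(tokens):
    """Count occurrences per sense, so different terms that share a
    meaning are conflated into a single candidate."""
    context = set(tokens)
    counts = Counter()
    for token in tokens:
        sense = disambiguate(token, context)
        if sense is not None:
            counts[sense] += 1
    return counts


doc = "the bank approved the loan and the lender released the money".split()
counts = sense_counts(doc)
```

In this toy document, "bank" and "lender" resolve to the same sense and are counted together, which a purely term-based frequency count would not do.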