Machine Learning for Keyphrases Extraction Based on Naive Bayesian Classifier

Jiabing Wang, Hong Peng, Jingsong Hu
DOI: https://doi.org/10.1109/iccias.2006.294249
2006-01-01
Abstract: Keyphrase extraction is a task with many applications in information retrieval, text mining, and natural language processing. This paper proposes a keyphrase extraction approach based on the naive Bayesian classifier. To determine whether a phrase is a keyphrase, the approach uses the following features of the phrase in a given document: its term frequency; whether it appears in the title, abstract, and headings (subheadings); and its frequency across the paragraphs of the document. The approach is evaluated with the standard information retrieval metrics of precision and recall. Experimental results show that the approach is practical: it achieves high precision and recall, with recall in particular exceeding 80 percent.
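The following is a minimal Python sketch of the kind of pipeline the abstract describes, not the authors' implementation. It computes the named features for a candidate phrase, trains a naive Bayes classifier on them, and scores predictions with precision and recall. The substring matching, the capping of counts into bins 0..3, the Laplace smoothing, and all function and class names (`phrase_features`, `NaiveBayes`, `precision_recall`) are assumptions made for illustration; the abstract does not specify how features are discretized or how probabilities are estimated.

```python
import math
from collections import defaultdict


def phrase_features(phrase, title, abstract, headings, paragraphs):
    """Return discrete features for one candidate phrase (hypothetical encoding)."""
    p = phrase.lower()
    body = " ".join(paragraphs).lower()
    tf = body.count(p)  # term frequency via simple substring counting (assumption)
    para_freq = sum(1 for para in paragraphs if p in para.lower())
    return (
        min(tf, 3),                             # term frequency, capped at 3 (assumed binning)
        p in title.lower(),                     # appears in the title?
        p in abstract.lower(),                  # appears in the abstract?
        any(p in h.lower() for h in headings),  # appears in any heading or subheading?
        min(para_freq, 3),                      # number of paragraphs containing it, capped at 3
    )


class NaiveBayes:
    """Categorical naive Bayes with Laplace smoothing (smoothing is an assumption)."""

    def fit(self, rows, labels):
        self.class_counts = defaultdict(int)
        self.value_counts = defaultdict(int)  # (label, feature index, value) -> count
        self.values = defaultdict(set)        # feature index -> values seen in training
        for row, y in zip(rows, labels):
            self.class_counts[y] += 1
            for i, v in enumerate(row):
                self.value_counts[(y, i, v)] += 1
                self.values[i].add(v)
        return self

    def log_score(self, row, y):
        # log P(y) + sum_i log P(x_i | y), assuming features are independent given y
        total = sum(self.class_counts.values())
        score = math.log(self.class_counts[y] / total)
        for i, v in enumerate(row):
            num = self.value_counts[(y, i, v)] + 1
            den = self.class_counts[y] + len(self.values[i] | {v})
            score += math.log(num / den)
        return score

    def predict(self, row):
        # Pick the class (keyphrase or not) with the highest posterior score.
        return max(self.class_counts, key=lambda y: self.log_score(row, y))


def precision_recall(predicted, gold):
    """Precision and recall of predicted keyphrases against author-assigned ones."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall
```

In use, each candidate phrase of a training document would be turned into a feature tuple and labeled `True` if it is one of the document's known keyphrases; `NaiveBayes().fit(rows, labels)` then yields a model whose `predict` marks candidates in new documents. Keeping the features discrete keeps the probability estimates to simple counts, at the cost of choosing bin boundaries by hand.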