KeyphraseDS: Automatic Generation of Survey by Exploiting Keyphrase Information
Shansong Yang, Weiming Lu, Dezhi Yang, Xi Li, Chao Wu, Baogang Wei
DOI: https://doi.org/10.1016/j.neucom.2016.10.052
IF: 6
2016-01-01
Neurocomputing
Abstract: In this paper, we present a novel document summarization mechanism called KeyphraseDS that organizes scientific articles into a multi-aspect, informative scientific survey by exploiting keyphrases. Keyphrases describe a text's salience and central focus, and can serve as the components of aspects under a specific topic. KeyphraseDS consists of three steps: keyphrase graph construction, semantic aspect generation, and content selection. Keyphrases are first extracted with a CRF-based model that exploits various features, such as syntactic features and correlation features. Spectral clustering is then performed on the keyphrase graph to generate different aspects, where the semantic relatedness between keyphrases is computed through knowledge-based similarity and topic-based similarity. The proposed semantic relatedness not only uses statistical text signals efficiently but also overcomes the data sparsity problem. Significant sentences are then selected with respect to the generated aspects through integer linear programming (ILP), which takes semantic relevance, semantic diversity, and keyphrase salience into consideration. Extensive experiments, measured by automatic and human evaluation, demonstrate the effectiveness of our mechanism for generating scientific surveys.
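The content selection step can be illustrated with a toy example. The sketch below is not the formulation from the paper, which the abstract does not spell out: the objective (salience of covered keyphrases minus a penalty for keyphrases shared between selected sentences), the word budget, the `redundancy_penalty` parameter, and the sample data are all assumptions made for illustration. It also swaps the ILP solver for exhaustive enumeration of the 0/1 choices, which is only feasible for a handful of sentences.

```python
from itertools import combinations


def select_sentences(sentences, salience, max_words, redundancy_penalty=0.5):
    """Pick the subset of sentences with the best score under a word budget.

    sentences: list of (text, set_of_keyphrases) pairs (hypothetical input format)
    salience:  dict mapping keyphrase -> weight (hypothetical scores)
    Brute force stands in for an ILP solver here; it is exponential in len(sentences).
    """
    best_score, best_subset = float("-inf"), ()
    indices = range(len(sentences))
    for size in range(len(sentences) + 1):
        for subset in combinations(indices, size):
            # Budget constraint: total length of the selected sentences.
            length = sum(len(sentences[i][0].split()) for i in subset)
            if length > max_words:
                continue
            # Salience term: each covered keyphrase counts once.
            covered = set().union(*(sentences[i][1] for i in subset))
            coverage = sum(salience.get(k, 0.0) for k in covered)
            # Diversity term: penalize keyphrases shared by pairs of selected sentences.
            overlap = sum(
                len(sentences[i][1] & sentences[j][1])
                for i, j in combinations(subset, 2)
            )
            score = coverage - redundancy_penalty * overlap
            if score > best_score:
                best_score, best_subset = score, subset
    return [sentences[i][0] for i in best_subset], best_score


# Made-up sentences and salience weights, for illustration only.
sentences = [
    ("Spectral clustering groups keyphrases into aspects",
     {"spectral clustering", "aspect"}),
    ("Spectral clustering is applied to the keyphrase graph",
     {"spectral clustering", "keyphrase graph"}),
    ("Integer linear programming selects sentences",
     {"integer linear programming"}),
]
salience = {
    "spectral clustering": 1.0,
    "aspect": 0.8,
    "keyphrase graph": 0.6,
    "integer linear programming": 0.9,
}

selected, score = select_sentences(sentences, salience, max_words=12)
print(selected, round(score, 2))
```

With a 12-word budget, the first and third sentences are chosen: together they cover three distinct keyphrases without overlap, whereas any pair containing the second sentence exceeds the budget. A real system would hand the same objective and constraints to an ILP solver rather than enumerate subsets.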