Document Keyword Extraction Based on Semantic Hierarchical Graph Model

Tingting Zhang, Baozhen Lee, Qinghua Zhu, Xi Han, Ke Chen
DOI: https://doi.org/10.1007/s11192-023-04677-7
IF: 3.801
2023-01-01
Scientometrics
Abstract: Keywords provide a brief profile of a document's contents and serve as an important means of quickly identifying its themes. Traditional keyword extraction methods are mostly based on statistical relationships between words, with no deeper understanding of the words' structures. In addition, most keyword extraction studies to date rely on ranking-related measure values, without considering the cohesion of the extracted keyword set. In this paper, a keyword extraction method based on a semantic hierarchical graph model is proposed. First, the semantic graph for the document is constructed based on the hierarchical extraction of feature terms. Then, the keyword collection of the document is chosen from the constructed semantic graph. The proposed method accounts for both the context of the keywords and the internal structure by which they are related. By mining the deep hidden structure of feature terms, it can effectively reveal the hierarchical associations between terms within the semantic graph and obtain a keyword collection with high probability. Moreover, several experiments conducted on released datasets show that the method outperforms existing methods in terms of precision, recall, and F-measure.
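The two-stage idea in the abstract (build a graph over terms, then choose a cohesive keyword set from it) can be illustrated with a minimal sketch. This is not the authors' semantic hierarchical graph model: it swaps in a plain sliding-window co-occurrence graph and a greedy selection rule, and every name in it (`build_cooccurrence_graph`, `select_keywords`, `STOPWORDS`, the `window` and `k` parameters) is a hypothetical choice made for illustration only.

```python
from collections import defaultdict

# Assumption: a tiny hand-picked stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "and", "for", "in", "on", "is", "are", "to", "from"}


def build_cooccurrence_graph(tokens, window=3):
    """Build an undirected weighted graph over terms.

    An edge weight counts how often two distinct terms appear within
    `window` consecutive positions after stopwords are removed.
    """
    graph = defaultdict(lambda: defaultdict(int))
    terms = [t.lower() for t in tokens if t.lower() not in STOPWORDS]
    for i, term in enumerate(terms):
        for other in terms[i + 1 : i + window]:
            if other != term:
                graph[term][other] += 1
                graph[other][term] += 1
    return graph


def select_keywords(graph, k=3):
    """Greedily pick up to k terms that form a cohesive set.

    Start from the term with the highest weighted degree, then repeatedly
    add the term with the strongest total connection to the terms already
    chosen (ties broken by weighted degree, then alphabetically).
    """
    if not graph:
        return []
    degree = {t: sum(nbrs.values()) for t, nbrs in graph.items()}
    chosen = [max(sorted(degree), key=degree.get)]
    while len(chosen) < min(k, len(graph)):
        candidates = [t for t in sorted(graph) if t not in chosen]

        def cohesion(term):
            # Total edge weight between the candidate and the chosen set.
            return (sum(graph[term].get(c, 0) for c in chosen), degree[term])

        chosen.append(max(candidates, key=cohesion))
    return chosen


if __name__ == "__main__":
    text = "keyword extraction from the semantic graph of keyword terms and semantic graph terms"
    graph = build_cooccurrence_graph(text.split())
    print(select_keywords(graph, k=3))
```

Selecting each new term by its connection to the already chosen set, rather than by an isolated score, is a simple stand-in for the cohesion criterion the abstract contrasts with pure ranking; the hierarchical extraction of feature terms described in the paper is not modeled here.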