Combining N-Gram and Dependency Word Pair for Multi-document Summarization

Yungang Ma, Ji Wu
DOI: https://doi.org/10.1109/cse.2014.39
2014-01-01
Abstract: This paper proposes a method for extractive multi-document summarization based on the combined features of n-gram co-occurrences and dependency word pair co-occurrences. Unigrams are the basic text units; bigrams and skip-bigrams reflect the sequential relationships between words in a sentence; dependency word pairs describe the syntactic relationships between words. The co-occurrences of each feature reflect the common topics of multiple documents from a different perspective. The score of a sentence is the weighted sum of the features it contains. The summary is generated by extracting salient sentences based on the maximum significance score model. This approach obtains higher ROUGE scores than several well-known methods on the TAC summarization dataset.
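The scoring idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything below is an assumption made for illustration: the function names, the skip distance `max_skip`, the feature weights, the use of document frequency as the co-occurrence count, and the toy data. Dependency pairs are passed in by hand rather than produced by a parser, and a simple greedy ranking under a word budget stands in for the maximum significance score model, whose details the abstract does not give.

```python
from collections import Counter
from itertools import combinations


def features(tokens, dep_pairs=(), max_skip=2):
    """Return the set of features found in one tokenized sentence."""
    feats = {("uni", w) for w in tokens}
    feats |= {("bi", a, b) for a, b in zip(tokens, tokens[1:])}
    # Skip-bigrams: ordered word pairs with 1..max_skip words between them
    # (max_skip is an illustrative choice, not a value from the paper).
    for i, j in combinations(range(len(tokens)), 2):
        if 2 <= j - i <= max_skip + 1:
            feats.add(("skip", tokens[i], tokens[j]))
    # Dependency word pairs are supplied by the caller as (head, dependent).
    feats |= {("dep", head, dep) for head, dep in dep_pairs}
    return feats


def feature_doc_counts(documents):
    """Count, for each feature, how many documents contain it (assumed measure)."""
    counts = Counter()
    for doc in documents:
        doc_feats = set()
        for tokens, dep_pairs in doc:
            doc_feats |= features(tokens, dep_pairs)
        counts.update(doc_feats)
    return counts


def sentence_score(tokens, dep_pairs, counts, weights):
    """Weighted sum over the features the sentence contains."""
    return sum(weights[f[0]] * counts[f] for f in features(tokens, dep_pairs))


def summarize(documents, weights, max_words=100):
    """Greedy stand-in for sentence selection: best-scoring sentences that fit."""
    counts = feature_doc_counts(documents)
    candidates = [sent for doc in documents for sent in doc]
    ranked = sorted(
        candidates,
        key=lambda s: sentence_score(s[0], s[1], counts, weights),
        reverse=True,
    )
    summary, used = [], 0
    for tokens, _ in ranked:
        if used + len(tokens) <= max_words:
            summary.append(" ".join(tokens))
            used += len(tokens)
    return summary


# Toy input: each document is a list of (tokens, dependency pairs) sentences.
docs = [
    [
        (["storm", "hits", "coast"], [("hits", "storm"), ("hits", "coast")]),
        (["markets", "fell"], [("fell", "markets")]),
    ],
    [
        (["storm", "hits", "city"], [("hits", "storm"), ("hits", "city")]),
    ],
]
# Hypothetical weights, one per feature type.
weights = {"uni": 1.0, "bi": 2.0, "skip": 1.5, "dep": 2.0}
short_summary = summarize(docs, weights, max_words=3)
```

In this toy input, the sentences about the storm share unigram, bigram, and dependency features across both documents, so they outscore the sentence that appears in only one document. A real system would also need tokenization, a dependency parser, tuned weights, and some handling of redundancy between selected sentences.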