Abstract: Query‐focused multi‐document summarization (Qf‐MDS) is a sub‐task of automatic text summarization that aims to extract a substitute summary from a cluster of documents on the same topic, guided by a user query. Unlike other summarization tasks, Qf‐MDS poses specific research challenges: the differences and similarities across related document sets, the high degree of redundancy inherent in summaries created from multiple related sources, relevance to the given query, topic diversity in the produced summary, and the small source‐to‐summary compression ratio. In this work, we propose a semantic diversity feature based query‐focused extractive summarizer (SDbQfSum), built on text semantic representation techniques underpinned by Wikipedia commonsense knowledge, to address the query‐relevance, centrality, redundancy and diversity challenges. Specifically, semantically parsed document text is combined with a knowledge‐based vectorial representation to extract effective sentence importance and query‐relevance features. The proposed monolingual summarizer is evaluated on a standard English dataset for automatic query‐focused summarization tasks, namely the DUC2006 dataset. The results show that our summarizer outperforms most state‐of‐the‐art related approaches on one or more ROUGE measures, achieving 0.418, 0.092 and 0.152 in ROUGE‐1, ROUGE‐2 and ROUGE‐SU4 respectively. It also remains competitive with the systems that slightly outperform it; for example, the difference between our system's result and that of the best system in ROUGE‐1 is just 0.006. Our experiments also show that the proposed custom cluster merging algorithm significantly reduces information redundancy while maintaining topic diversity across documents.
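The trade-off between query relevance and redundancy described in the abstract can be illustrated with a minimal sketch. This is not the SDbQfSum method: it assumes plain bag-of-words cosine similarity in place of the Wikipedia-based semantic representation, and a generic greedy, MMR-style selection in place of the paper's cluster merging algorithm. The names `bow`, `cosine`, `select_sentences` and the `trade_off` parameter are hypothetical and chosen for illustration only.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (assumed stand-in for semantic features)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_sentences(sentences, query, k=2, trade_off=0.7):
    """Greedy MMR-style selection (illustrative, not the authors' algorithm).

    Each step picks the sentence that maximizes
    trade_off * relevance_to_query - (1 - trade_off) * max_similarity_to_chosen.
    """
    vectors = [bow(s) for s in sentences]
    q = bow(query)
    chosen = []
    while len(chosen) < min(k, len(sentences)):
        best_i, best_score = None, float("-inf")
        for i, v in enumerate(vectors):
            if i in chosen:
                continue
            # Redundancy is the highest similarity to any already-chosen sentence.
            redundancy = max((cosine(v, vectors[j]) for j in chosen), default=0.0)
            score = trade_off * cosine(v, q) - (1 - trade_off) * redundancy
            if score > best_score:
                best_i, best_score = i, score
        chosen.append(best_i)
    return [sentences[i] for i in chosen]
```

Given two identical sentences and a third on a different sub-topic, this sketch would select one copy of the duplicate and then prefer the distinct sentence, which is the behavior a redundancy-aware, diversity-preserving summarizer aims for.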

Multi-document Summarization Based on Sentence Clustering

Co-clustering Sentences and Terms for Multi-document Summarization

Mining Both Commonality and Specificity From Multiple Documents for Multi-Document Summarization

Clustering Sentences with Density Peaks for Multi-document Summarization

Multi-Document Summarization via Discriminative Summary Reranking

Subtopic-based Multi-Document Summarization

Multi-document summarization using cluster-based link analysis

SgSum: Transforming Multi-document Summarization into Sub-graph Selection

Multi-Document Summarization Based On Two-Level Sparse Representation Model

Automatic summarization model based on clustering algorithm

Using a Double Clustering Approach to Build Extractive Multi-document Summaries

CollabSum: exploiting multiple document clustering for collaborative single document summarizations

Automatic multi-document summarization based on new sentence similarity measures

A Supervised Aggregation Framework for Multi-Document Summarization

Automatic Topic-oriented Multi-document Summarization with Combination of Query-dependent and Query-independent Rankers

An Unsupervised Multi-Document Summarization Framework Based on Neural Document Model

Document Summarization Via Self-Present Sentence Relevance Model

SDbQfSum: Query‐focused summarization framework based on diversity and text semantic analysis

Automatic Text Summarization Method Based on Improved TextRank Algorithm and K-Means Clustering

Multi-document Summarization by Information Distance