Semi-supervised co-clustering for query-oriented theme-based summarization

Libin Yang, Xiaoyan Cai
2012-01-01
Abstract: Sentence clustering plays an important role in theme-based summarization, which aims to discover topical themes, defined as clusters of highly related sentences. However, because sentences are short, the word-vector cosine similarity traditionally used for document clustering is no longer suitable. To alleviate this problem, we regard a word as an independent text object rather than a feature of the sentence, and we develop a noise-detection-enhanced co-clustering framework that clusters sentences and words simultaneously. We also explore a semi-supervised clustering approach that biases the generated summary towards the given query. An evaluation on three DUC query-oriented summarization datasets demonstrates the effectiveness of the approaches. © Maxwell Scientific Organization, 2012.
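The short-sentence problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it only computes plain bag-of-words cosine similarity, and the helper name `cosine` and the three example sentences are assumptions made up for illustration. Two sentences on the same theme that share no surface words score zero, which is the kind of sparsity that motivates treating words as objects to be clustered alongside sentences.

```python
import math
from collections import Counter


def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the words the two texts have in common.
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    # An empty text has a zero vector; define its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical example sentences (not taken from the paper or from DUC data).
s1 = "physicians recommend vaccination"
s2 = "doctors advise immunization"   # same theme as s1, no shared words
s3 = "physicians recommend rest"     # different advice, two shared words

print(cosine(s1, s2))
print(cosine(s1, s3))
```

Here `s1` and `s2` are near-paraphrases yet have no overlap, while `s1` and `s3` overlap on two of three words. With longer documents such vocabulary mismatches tend to average out, but with single sentences they dominate the score.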