Topic Decomposition and Summarization.

Wei Chen,Can Wang,Chun Chen,Lijun Zhang,Jiajun Bu
DOI: https://doi.org/10.1007/978-3-642-13657-3_47
2010-01-01
Abstract:In this paper, we study topic decomposition and summarization for a temporal-sequenced text corpus of a specific topic. The task is to discover different topic aspects (i.e., sub-topics) and incidents related to each sub-topic of the text corpus, and generate summaries for them. We present a solution with the following steps: (1) deriving sub-topics by applying Non-negative Matrix Factorization (NMF) to terms-by-sentences matrix of the text corpus; (2) detecting incidents of each sub-topic and generating summaries for both sub-topic and its incidents by examining the constitution of its encoding vector generated by NMF; (3) ranking each sentences based on the encoding matrix and selecting top ranked sentences of each sub-topic as the text corpus' summary. Experimental results show that the proposed topic decomposition method can effectively detect various aspects of original documents. Besides, the topic summarization method achieves better results than some well-studied methods.
What problem does this paper attempt to address?