Comments-Oriented Summarization in Blogsphere Using a Two-Stage Sentence Similarity Measure

Hongjie Li, Lifu Huang, Qifeng Fan, Lian’en Huang
DOI: https://doi.org/10.1007/978-3-319-08010-9_52
2014-01-01
Abstract: The popularity of Web 2.0 applications promotes the emergence of user-generated content (UGC), e.g., comments in the blogosphere, and UGC reflects the viewpoints of web users towards a specific event or product. In this paper, we propose a summarization model that applies a novel sentence similarity measure. In the proposed two-stage similarity measure, both standard cosine similarity and a topic-based similarity measure are explored to compute the preliminary similarity, and a learning method based on an optimization perspective then combines these different types of similarity into a refined similarity measure. To evaluate the novel similarity measure, we conduct experiments on a real-world blog data set, and the results demonstrate the effectiveness of our proposed method.
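The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the learning method, so a fixed weight `alpha` stands in for the learned combination, and the function names, the whitespace tokenization, and the topic distributions passed as dictionaries are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def term_vector(sentence):
    """Term-frequency vector using naive whitespace tokenization (assumption)."""
    return dict(Counter(sentence.lower().split()))


def combined_similarity(s1, s2, topics1, topics2, alpha=0.5):
    """Stage 1: compute lexical and topic-based preliminary similarities.
    Stage 2: combine them. Here alpha is a fixed, hypothetical weight that
    stands in for the learned combination described in the abstract."""
    lexical = cosine(term_vector(s1), term_vector(s2))
    topical = cosine(topics1, topics2)  # topics given as {topic_id: probability}
    return alpha * lexical + (1.0 - alpha) * topical
```

In this sketch the topic distributions would come from a separately trained topic model, and `alpha` would be replaced by whatever combination the learning stage produces.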