A spatiotemporal clustering approach improved by topic burstiness for event detection

ZhuMin Chen, Guanghui Wang, Jingsheng Lei, Jun Ma
2012-01-01
Journal of Information and Computational Science
Abstract: Document clustering plays an increasingly significant role given the exponential growth of documents on the Web. Many approaches have been proposed to solve the clustering problem, and they work well on static data sets. However, most of them create clusters based only on the semantic distance between texts, without considering the temporal and spatial relationships between the elements. In addition, almost all existing methods aim to find all topic groups present in a news collection, but it is extremely difficult to cover all topic groups exactly. In this paper, we propose a novel spatiotemporal clustering approach for Web news. We first use the temporal burst characteristic of topics to automatically and quickly predict an optimal number of groups to be clustered. Then, we extend the traditional distance measure into a novel function that uses the temporal, spatial and content information of the documents. Finally, we outline an algorithm based on the above two steps. We collect a real Web data set and demonstrate through a series of experiments that our method significantly outperforms the baseline clustering methods. © 2012 by Binary Information Press.
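The abstract does not give the form of the extended distance function, so the following is only an illustrative sketch of how content, temporal and spatial information might be combined into a single distance. The weighted-sum form, the weights, the normalization caps (`max_days`, `max_km`), the document fields (`text`, `day`, `loc`) and all function names are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * 6371.0 * math.asin(math.sqrt(min(1.0, h)))


def combined_distance(d1, d2, w_content=0.6, w_time=0.2, w_space=0.2,
                      max_days=30.0, max_km=20000.0):
    """Hypothetical weighted sum of content, temporal and spatial distances.

    Each component is scaled to [0, 1]; the weights are assumed values
    that sum to 1, so the result also lies in [0, 1].
    """
    content = cosine_distance(
        Counter(d1["text"].lower().split()),
        Counter(d2["text"].lower().split()),
    )
    time_gap = min(abs(d1["day"] - d2["day"]) / max_days, 1.0)
    space_gap = min(haversine_km(d1["loc"], d2["loc"]) / max_km, 1.0)
    return w_content * content + w_time * time_gap + w_space * space_gap
```

Under these assumptions, two news items with the same wording, date and location have a distance close to 0, while items that differ in all three respects approach 1. Any standard clustering routine that accepts a pairwise distance could then be run with the number of groups supplied by a separate burst-detection step.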