Abstract: Microblogging has become one of the most popular social Web applications in recent years. Posting short messages (i.e., a maximum of 140 characters) to the Web at any time and from any place lowers the usage barrier, accelerates information diffusion, and makes instant publication possible. Among the posts users publish every day, many relate to recent or real-time events in daily life. Microblog sites usually display on their homepages a list of words representing the trending topics of a time period (e.g., 24 hours, a week, or even longer), but these topical words alone do not give users a comprehensive view of a topic, especially users without background knowledge. Moreover, users must open each post in the relevant list to learn the details of a topic. In this paper, we propose a unified workflow for event detection, tracking, and summarization on microblog data. In particular, we introduce novel features that take the characteristics of microblog data into account for topical word selection, and thus for event detection. In the tracking phase, a bipartite graph is constructed to capture the relationship between two events occurring at adjacent times. Matched event pairs are grouped into an event chain. Furthermore, inspired by diversity theory in Web search, we are the first to summarize event chains by considering content coverage and evolution over time. The experimental results show the effectiveness of our approach on microblog data.

Unsupervised model for Microblog new words detection based on repeated string

New Word Identification in Social Network Text Based on Time Series Information

New words detection method for microblog text based on integrating of rules and statistics

New words discovery in microblog content

New Words Recognition Algorithm and Application Based on Micro-Blog Hot

Research on Micro-blog New Word Recognition Based on MapReduce.

A Chinese Unknown Word Recognition Method for Micro-Blog Short Text Based on Improved FP-growth.

Feature extension of cluster analysis based on Microblog

A Semi-Supervised Bayesian Network Model for Microblog Topic Classification.

Analysis on new word detection and sentiment orientation in Micro-blog

New Word Detection Using BiLSTM+CRF Model with Features

Real-time Event Detection and Tracking in Microblog via Text Chain and Sentiment time series

Micro-blog topic detection method based on overlap community detection

Chinese Microblog Topic Detection through POS-Based Semantic Expansion

Domain-Specific New Words Detection in Chinese.

Detecting microblog spammers based on reuse detection

An Approach to Named Entity Recognition Towards Micro-Blog

Graph-based Micro-blog Advertisement Text Recognition

Preliminary Study of Chinese Word Segmentation and Part-of-Speech Tagging Being Used for Microblog Data

Towards Effective Event Detection, Tracking And Summarization On Microblog Data

Micro-blogging Hot Words Extraction and Topic Detection