News Feature Extraction for Events on Social Network Platforms
Peiquan Jin, Lin Mu, Lizhou Zheng, Jie Zhao, Lihua Yue
DOI: https://doi.org/10.1145/3041021.3054151
2017-01-01
Abstract: Microblog-based social network platforms such as Twitter and Sina Weibo have become important sources for news event extraction. However, existing work on microblog event extraction, which usually uses keywords, entities, or selected microblogs to represent events, cannot extract the details of an event. From the perspective of news reporting, an event should present detailed news features, i.e., when, where, who, whom, and what. Such news features are helpful for conducting deep data analysis on microblogs, e.g., competitor monitoring and public crisis discovery. The challenge is that the news features of an event on microblogs are usually scattered across different posts because microblogs are short texts. This is very different from extracting news events from Web news pages, which usually contain most details of an event. In this paper, we propose a new framework to extract events together with their news features from microblogs. We first extract a set of events from microblogs. Each event is represented as a distribution over four kinds of named entities: location, person name, organization, and time. In addition, the type of each event, i.e., location-related, person-related, or organization-related, is determined by a machine-learning method. To obtain the news features of an event, we propose an event-clustering approach that groups all relevant events into a cluster. For each cluster, we propose different algorithms to extract the news features of the event reported in the cluster. We conduct experiments on two microblog datasets crawled from a commercial microblogging platform to evaluate the performance of the proposed framework. The results suggest that the proposed framework is effective.
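The event representation and clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the clustering algorithm, so everything below is an assumption made for illustration: the names `make_event`, `similarity`, and `cluster_events` are hypothetical, cosine similarity averaged over the four entity types stands in for the unspecified measure, and a greedy single-pass grouping with an arbitrary `threshold` stands in for the paper's event-clustering approach.

```python
from collections import Counter
from math import sqrt

# The four kinds of named entities named in the abstract.
ENTITY_TYPES = ("location", "person", "organization", "time")


def make_event(entities):
    """Build an event as one normalized distribution per entity type.

    `entities` maps an entity type to the list of entity mentions
    observed for the event (hypothetical input format).
    """
    event = {}
    for etype in ENTITY_TYPES:
        counts = Counter(entities.get(etype, []))
        total = sum(counts.values())
        event[etype] = {e: c / total for e, c in counts.items()} if total else {}
    return event


def cosine(p, q):
    """Cosine similarity between two sparse distributions (dicts)."""
    dot = sum(p[k] * q[k] for k in p.keys() & q.keys())
    norm = sqrt(sum(v * v for v in p.values())) * sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0


def similarity(a, b):
    """Assumed measure: mean cosine similarity over the four entity types."""
    return sum(cosine(a[t], b[t]) for t in ENTITY_TYPES) / len(ENTITY_TYPES)


def cluster_events(events, threshold=0.5):
    """Greedy single-pass clustering (an assumption, not the paper's method).

    Each event joins the first cluster whose seed event is similar enough;
    otherwise it starts a new cluster. Returns lists of event indices.
    """
    clusters = []
    for idx, event in enumerate(events):
        for cluster in clusters:
            if similarity(event, events[cluster[0]]) >= threshold:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters
```

Under these assumptions, events that share the same locations, people, organizations, and times end up in one cluster, which is the unit from which the when, where, who, whom, and what features would then be extracted.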