A System to Manage and Mine Microblogging Data.
Zhongying Zhao, Yong Zhang, Chao Li, Li Ning, Jiancong Fan
DOI: https://doi.org/10.3233/jifs-161622
2017-01-01
Journal of Intelligent & Fuzzy Systems
Abstract: Microblogging, an internet-based social media application, is widely used by people to share ideas and exchange views. It is also a valuable resource for capturing people's interests, thoughts, and actions. The content that people post spreads widely, has distinctive topical features, and carries sentiment-rich information about individuals. In this paper, we develop a microblogging data mining system with three functional modules: data preparation, topic analysis, and sentiment computing. In data preparation, the data collection module is implemented on Hadoop, and a Storm-based management and preprocessing module is designed so that the data are processed effectively and efficiently. For topic analysis, an LDA-based method is adopted to detect topics hidden in microblogs. We also propose a method to calculate the hot degree of each topic and to track topic evolution. For sentiment analysis, an integrated method that combines the emotion and the dictionary is introduced to quantify sentiment. Furthermore, the system is used to perform a spatial sentiment analysis of microblogging data from all provinces of China. Sentiment changes associated with topics or events are also presented through visualization.
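The abstract does not specify how the sentiment score is computed, so the following is only a minimal sketch of one way a dictionary signal and an emotion signal might be combined and then averaged per province. It assumes that the emotion cue is an emoticon, that the two signals are mixed with a fixed weight, and that posts arrive as (region, text) pairs. The lexicons, the weight, and the function names `post_sentiment` and `sentiment_by_region` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy lexicons; the paper's actual dictionary and
# emotion resources are not described in the abstract.
WORD_SCORES = {"good": 1.0, "happy": 1.0, "bad": -1.0, "sad": -1.0}
EMOTICON_SCORES = {":)": 1.0, ":(": -1.0}


def post_sentiment(text, emoticon_weight=0.5):
    """Score one post as a weighted mix of dictionary and emoticon signals.

    The fixed-weight mix is an assumption made for illustration only.
    """
    tokens = text.lower().split()
    word_hits = [WORD_SCORES[t] for t in tokens if t in WORD_SCORES]
    emo_hits = [EMOTICON_SCORES[t] for t in tokens if t in EMOTICON_SCORES]
    # Each signal is the mean score of its matches, or 0.0 if nothing matched.
    word_part = sum(word_hits) / len(word_hits) if word_hits else 0.0
    emo_part = sum(emo_hits) / len(emo_hits) if emo_hits else 0.0
    return (1.0 - emoticon_weight) * word_part + emoticon_weight * emo_part


def sentiment_by_region(posts):
    """Average the per-post scores for each region.

    `posts` is an iterable of (region, text) pairs.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region, text in posts:
        totals[region] += post_sentiment(text)
        counts[region] += 1
    return {region: totals[region] / counts[region] for region in totals}
```

Calling `sentiment_by_region` on a list of (province, text) pairs yields one average score per province, which is the kind of value a spatial sentiment map could display. Real microblog text would need proper word segmentation rather than whitespace splitting.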