Multi-Strategy Based Sina Microblog Data Acquisition For Opinion Mining

Xiao Sun, Jia-qi Ye, Fuji Ren
DOI: https://doi.org/10.1007/978-3-319-09339-0_56
2014-01-01
Abstract: As an important medium for social interaction and information dissemination on the internet, Sina microblog carries the emotional states and opinions of its participants. Processing microblog data is a big data problem, and its prerequisite is obtaining a large amount of microblog data for further analysis and data mining. For commercial as well as security reasons, access to the data is becoming increasingly difficult, and the API officially provided by Sina microblog does not support mining of large amounts of data. In this paper, we design a platform, based mainly on a multi-strategy access mechanism and existing resources, to collect data stably from Sina microblog. The results demonstrate that combining the API with a web crawler allows efficient data mining. Sentiment analysis and opinion mining are then performed on the data obtained by the multi-strategy method, which shows that the proposed approach supports straightforward applications such as hot-word search, opinion mining, and sentiment analysis.
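The abstract does not say how the API and crawler strategies are combined, so the following is only a minimal sketch of one plausible arrangement: try each access strategy in order and fall back to the next when one is unavailable. All names here (`StrategyUnavailable`, `fetch_with_fallback`, `api_fetch`, `crawler_fetch`) are hypothetical, and the two fetchers are stand-ins that make no network calls; they are not the authors' implementation or any real Sina microblog API.

```python
from typing import Callable, List, Sequence, Tuple


class StrategyUnavailable(Exception):
    """Raised by a strategy that cannot serve the request right now."""


# A fetcher takes a query string and returns a list of post texts.
Fetcher = Callable[[str], List[str]]


def fetch_with_fallback(
    query: str, strategies: Sequence[Tuple[str, Fetcher]]
) -> Tuple[str, List[str]]:
    """Try each (name, fetcher) pair in order; return the first success."""
    errors = []
    for name, fetch in strategies:
        try:
            return name, fetch(query)
        except StrategyUnavailable as exc:
            # Record why this strategy failed and move on to the next one.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all strategies failed: " + "; ".join(errors))


def api_fetch(query: str) -> List[str]:
    # Stand-in for an official API client (assumption): simulate a quota limit.
    raise StrategyUnavailable("rate limit reached")


def crawler_fetch(query: str) -> List[str]:
    # Stand-in for a web crawler (assumption): return placeholder posts.
    return [f"post about {query} #1", f"post about {query} #2"]


used, posts = fetch_with_fallback(
    "hot topic", [("api", api_fetch), ("crawler", crawler_fetch)]
)
print(used, len(posts))
```

In this sketch the order of the list encodes the preference between strategies, so a real platform could rank the API first and add further fallbacks without changing the dispatch logic.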