Data Crawling and Research Based on Topic Web Crawler

Zhongsheng Wang, Jieyi Lv
DOI: https://doi.org/10.1109/ICCNEA57056.2022.00049
2022-09-01
Abstract: With the rise of big data, efficiently acquiring large volumes of existing data and analyzing them from multiple angles has become a key technology. Compared with a traditional general-purpose web crawler, the topic web crawler strategy adopted in this paper targets relevant pages more precisely and therefore collects data more efficiently. Using film review data from Douban.com, the paper crawls and analyzes the reviews without disrupting the operation of the website. With Python libraries such as jieba, matplotlib, and wordcloud, the data are visualized as word clouds, pie charts, and line charts, which let users see at a glance the keywords and the proportions of favorable and negative reviews for a film, and so help them choose what to watch. The approach also offers a reference for accurate software recommendation in the era of big data.
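The analysis step named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the sample reviews, the function names `keyword_frequencies` and `sentiment_shares`, and the rating thresholds (4 to 5 stars favorable, 1 to 2 stars negative) are assumptions made for illustration. The reviews are given as already-segmented tokens; in practice, the tokens could be produced with `jieba.lcut`.

```python
from collections import Counter

# Hypothetical sample data: (star rating 1-5, tokens of one review).
# Real tokens could come from jieba.lcut(review_text); these are made up.
REVIEWS = [
    (5, ["剧情", "精彩", "演员", "出色"]),
    (4, ["画面", "精彩", "节奏", "流畅"]),
    (2, ["剧情", "拖沓", "失望"]),
    (1, ["失望", "无聊"]),
    (3, ["剧情", "的", "一般"]),
]

# Assumed stop-word list; a real one would be much longer.
STOPWORDS = {"的", "了", "是"}


def keyword_frequencies(reviews, stopwords=STOPWORDS):
    """Count how often each non-stop-word token appears across all reviews."""
    counts = Counter()
    for _rating, tokens in reviews:
        counts.update(t for t in tokens if t not in stopwords)
    return counts


def sentiment_shares(reviews):
    """Return the share of favorable, neutral, and negative reviews.

    Assumption: 4-5 stars is favorable, 1-2 stars is negative, 3 is neutral.
    """
    labels = Counter()
    for rating, _tokens in reviews:
        if rating >= 4:
            labels["favorable"] += 1
        elif rating <= 2:
            labels["negative"] += 1
        else:
            labels["neutral"] += 1
    total = len(reviews)
    return {k: labels[k] / total for k in ("favorable", "neutral", "negative")}


freqs = keyword_frequencies(REVIEWS)
shares = sentiment_shares(REVIEWS)
print(freqs.most_common(3))
print(shares)
```

The frequency mapping could then be passed to `WordCloud.generate_from_frequencies` (a font that supports Chinese must be supplied through `font_path`), and the shares to `matplotlib.pyplot.pie`, to obtain the kind of word cloud and pie chart the abstract describes.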
Computer Science