Loklak - A Distributed Crawler and Data Harvester for Overcoming Rate Limits

Sudheesh Singanamalla, Michael Peter Christen
DOI: https://doi.org/10.48550/arXiv.1704.03624
2017-04-12
Information Retrieval
Abstract: Modern social networks have become sources of vast quantities of data. Access to such data is valuable to researchers and data scientists. In this paper we describe Loklak, an open-source, distributed, peer-to-peer crawler and scraper that supports such research on platforms like Twitter, Weibo and other social networks. Social networks such as Twitter and Weibo limit the rate at which users can freely collect such data for research. Our crawler enables researchers to collect data continuously while overcoming the barriers of authentication and rate limits that these platforms impose, and it provides a repository of open data as a service.