A general and modular framework for dark web analysis

José Manuel Ruiz Ródenas, Javier Pastor-Galindo, Félix Gómez Mármol
DOI: https://doi.org/10.1007/s10586-023-04189-2
2023-12-07
Cluster Computing
Abstract: The dark web, often linked with illegal activities, can be monitored with different solutions. However, these tools are typically purpose-specific and designed for unique use cases. In this study, we propose a flexible and scalable framework that facilitates the easy integration of new workflows for dark web analysis. The design is based on control, logic and operations layers, supplemented by a tools module, log management, asynchronous message-based communication and a database. The implementation maps these features onto a microservice architecture built on the open-source technologies Docker Swarm, Kafka, the ELK Stack (Elasticsearch, Logstash and Kibana), and PostgreSQL. A workflow to scrape web elements of Tor onion services is deployed and validated, demonstrating considerable framework performance despite the time-consuming task of navigating the dark web. Over 16 h, the framework collected over half a million onion domains (84,371 unique ones) and made 78,555 accesses to them.
computer science, information systems, theory & methods
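
The layered, message-based design described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `asyncio.Queue` stands in for Kafka topics, the function names and message fields are hypothetical, and placing URL deduplication in the logic layer is an assumption made only to mirror the abstract's distinction between collected and unique domains. No network access is performed.

```python
import asyncio


async def control_layer(seeds, tasks):
    """Control layer (hypothetical): publish one task message per seed URL."""
    for url in seeds:
        await tasks.put({"action": "scrape", "url": url})
    await tasks.put(None)  # stop marker, an assumption of this sketch


async def logic_layer(tasks, jobs):
    """Logic layer (hypothetical): drop duplicate URLs, forward the rest."""
    seen = set()
    while (msg := await tasks.get()) is not None:
        if msg["url"] not in seen:
            seen.add(msg["url"])
            await jobs.put(msg)
    await jobs.put(None)


async def operations_layer(jobs, results):
    """Operations layer (hypothetical): record a simulated access per URL."""
    while (msg := await jobs.get()) is not None:
        # A real worker would fetch the page through a Tor proxy here.
        results.append({"url": msg["url"], "status": "visited"})


async def run_workflow(seeds):
    # Two in-memory queues stand in for two message topics.
    tasks, jobs, results = asyncio.Queue(), asyncio.Queue(), []
    await asyncio.gather(
        control_layer(seeds, tasks),
        logic_layer(tasks, jobs),
        operations_layer(jobs, results),
    )
    return results


def run(seeds):
    """Synchronous entry point for the sketch."""
    return asyncio.run(run_workflow(seeds))
```

Calling `run(["a.onion", "b.onion", "a.onion"])` passes three messages through the control layer but only two reach the operations layer, since the repeated address is filtered out. Because the layers share nothing but queues, each one could be replaced by a separate service without changing the others, which is the property the microservice mapping relies on.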