Automatically Discovering Surveillance Devices in the Cyberspace

Qiang Li, Xuan Feng, Haining Wang, Limin Sun
DOI: https://doi.org/10.1145/3083187.3084020
2017-01-01
Abstract: Surveillance devices with IP addresses are accessible on the Internet and play a crucial role in monitoring the physical world. Discovering surveillance devices is a prerequisite for ensuring the high availability, reliability, and security of these devices. However, today's device search depends on keywords in packet header fields, and keyword collection is done manually, which requires enormous human effort and inevitably introduces human error. The difficulty of keeping keywords complete and up to date has severely impeded accurate, large-scale device discovery. To address this problem, we propose to automatically generate device fingerprints based on the webpages embedded in surveillance devices. We use natural language processing to extract the content of webpages and machine learning to build a classification model. We achieve real-time and non-intrusive web crawling by leveraging network scanning technology. We implement a prototype of our proposed discovery system and evaluate its effectiveness through real-world experiments. The experimental results show that the automatically generated fingerprints achieve 99% precision and 96% recall. We also deploy the prototype system on Amazon EC2 and search for surveillance devices across the entire IPv4 space (nearly 4 billion addresses). We found almost 1.6 million devices, about twice as many as are found using commercial search engines.
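The fingerprinting idea in the abstract (extract the text of a device's embedded webpage, then classify it) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name a specific NLP method or classifier, so the tag-stripping tokenizer, the multinomial Naive Bayes model, the class labels, and the four training pages below are all assumptions chosen only to keep the example self-contained.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and skips the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def tokenize(html):
    """Turn raw HTML into a list of lowercase alphanumeric tokens."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z0-9]+", " ".join(parser.parts).lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (an assumed stand-in model)."""

    def fit(self, pages, labels):
        self.counts = {}
        self.priors = Counter(labels)
        self.vocab = set()
        for page, label in zip(pages, labels):
            tokens = tokenize(page)
            self.counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, page):
        tokens = tokenize(page)
        total_pages = sum(self.priors.values())
        best, best_score = None, -math.inf
        for label, counter in self.counts.items():
            denom = sum(counter.values()) + len(self.vocab)
            score = math.log(self.priors[label] / total_pages)
            for tok in tokens:
                score += math.log((counter[tok] + 1) / denom)
            if score > best_score:
                best, best_score = label, score
        return best


# Hypothetical training pages; real device pages would come from a crawl.
TRAIN = [
    ("<html><title>Network Camera</title><body>Live view PTZ login</body></html>", "camera"),
    ("<html><title>IP Camera Web Client</title><body>video stream login</body></html>", "camera"),
    ("<html><title>Welcome to nginx</title><body>server is running</body></html>", "other"),
    ("<html><title>Router Setup</title><body>wireless settings login</body></html>", "other"),
]

model = NaiveBayes().fit([page for page, _ in TRAIN], [label for _, label in TRAIN])
```

Calling `model.predict(html)` on a fetched page returns one of the assumed labels. In a setting like the one the abstract describes, the training set would be far larger and the predicted labels would be compared against ground truth to measure precision and recall.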