BotGraph: Web Bot Detection Based on Sitemap

Luo Yang, She Guozhen, Huang Jinwan, Cheng Peng, Xiong Yongqiang
2019-01-01
Abstract: Web bots have been blamed for years for consuming large amounts of Internet traffic and undermining the interests of the sites they scrape. Traditional bot detection studies focus mainly on signature-based solutions, but advanced bots usually forge their identities to bypass such detection. With increasing cloud migration, cloud providers offer new opportunities for effective bot detection based on big data to solve this issue. In this paper, we present a behavior-based bot detection scheme called BotGraph that combines a sitemap with a convolutional neural network (CNN) to detect the inner behavior of bots. Experimental results show that BotGraph achieves ~95% recall and precision on 35-day production data traces from different customers, including the Bing search engine and several sites.