Employing Honeynets For Network Situational Awareness

P. Barford, Y. Chen, A. Goyal, Z. Li, V. Paxson, V. Yegneswaran
DOI: https://doi.org/10.1007/978-1-4419-0140-8_5
2010-01-01
Abstract: Effective network security administration depends to a great extent on having accurate, concise, high-quality information about malicious activity in one's network. Honeynets can potentially provide such detailed information, but the volume and diversity of this data can prove overwhelming. We explore ways to integrate honeypot data into daily network security monitoring with a goal of sufficiently classifying and summarizing the data to provide ongoing "situational awareness." We present such a system, built using the Bro network intrusion detection system coupled with statistical analysis of numerous honeynet "events", and discuss experiences drawn from many months of operation. In particular, we develop methodologies by which sites receiving such probes can infer, using purely local observation, information about the probing activity: What scanning strategies does the probing employ? Is this an attack that specifically targets the site, or is the site only incidentally probed as part of a larger, indiscriminate attack? One key aspect of this environment is its ability to provide insight into large-scale events. We look at the problem of accurately classifying botnet sweeps and worm outbreaks, which turns out to be difficult to grapple with due to the high dimensionality of such incidents. Using datasets collected during a number of these events, we explore the utility of several analysis methods, finding that when used together they show good potential for contributing towards effective situational awareness. Our analysis draws upon extensive honeynet data to explore the prevalence of different types of scanning, including properties such as trend, uniformity, coordination, and darknet-avoidance. In addition, we design schemes to extrapolate the global properties of scanning events (e.g., total population and target scope) as inferred from the limited local view of a honeynet.
Cross-validating with data from DShield shows that such inferences exhibit promising accuracy.