Development and Analysis of a Reconnaissance-Technique Knowledge Graph

Thomas Heverin, Elsa Deitz, Eve Cohen, Jordana Wilkes
DOI: https://doi.org/10.34190/iccws.18.1.1041
2023-02-28
International Conference on Cyber Warfare and Security
Abstract: Penetration testing involves the use of many tools and techniques. The first stage of penetration testing involves conducting reconnaissance on a target organization. In the reconnaissance phase, adversaries use tools to find network data, people data, company/organization data, and attack data to generate a risk assessment about a target and determine where initial weaknesses may be. Although a small number of tools, including Shodan, Nmap, Recon-ng, Maltego, Metasploit, Google and more, can be used to conduct many reconnaissance tasks, each tool holds an abundance of specific techniques that can be used. Furthermore, each technique uses unique syntax. For example, Nmap holds over 600 scripts that make up its Nmap Scripting Engine. Depending on the type of device targeted, Nmap scripts can scan for ports, operating systems, IP addresses, hostnames and more. As another example, Maltego operates over 150 transforms or modules that collect data on organizations, files and people. Understanding which reconnaissance tools to use, which techniques within those tools to apply, and the syntax for each technique represents a highly complex task. MITRE ATT&CK, a widely accepted framework, models tactics and the techniques within those tactics to help users make sense of adversarial behaviours. The tactic of reconnaissance and its techniques are modelled in ATT&CK. However, the explicit links between reconnaissance techniques are not modelled. Our research focused on the development of an ontology called Recontology to model the domain of reconnaissance. Recontology was then used to form the Reconnaissance-Technique Graph (RT-Graph), which models 102 reconnaissance techniques and the directional links between the techniques. We used exploratory data analysis (EDA) methods, including a graph spatial-layout algorithm and several graph-statistical algorithms, to examine RT-Graph. We also used EDA to find critical techniques within the graph. Patterns across the results are discussed, as well as implications for real-world uses of RT-Graph.
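To make the idea of a directed technique graph concrete, the following minimal Python sketch builds a toy graph and computes two simple graph statistics. It is an illustration under stated assumptions, not the authors' implementation: the technique names and links are hypothetical and are not taken from RT-Graph, and the abstract does not say which graph-statistical algorithms were used, so in-degree, out-degree, and reachability are assumed stand-ins.

```python
from collections import defaultdict

# Hypothetical technique names and directional links, for illustration only.
# They are NOT taken from RT-Graph. An edge (a, b) means "the output of
# technique a can feed technique b".
EDGES = [
    ("find_domain", "enumerate_subdomains"),
    ("find_domain", "lookup_whois"),
    ("enumerate_subdomains", "resolve_ip_addresses"),
    ("lookup_whois", "find_employee_names"),
    ("resolve_ip_addresses", "scan_open_ports"),
    ("resolve_ip_addresses", "identify_services"),
    ("scan_open_ports", "identify_services"),
]


def degree_counts(edges):
    """Return {node: (in_degree, out_degree)} for a directed edge list."""
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    nodes = set()
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
        nodes.update((source, target))
    return {node: (in_deg[node], out_deg[node]) for node in nodes}


def reachable_from(edges, start):
    """Return the set of nodes reachable from start by following edges."""
    adjacency = defaultdict(list)
    for source, target in edges:
        adjacency[source].append(target)
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


if __name__ == "__main__":
    degrees = degree_counts(EDGES)
    # Rank techniques by how many other techniques they can lead to.
    ranked = sorted(degrees, key=lambda n: len(reachable_from(EDGES, n)), reverse=True)
    for name in ranked:
        in_d, out_d = degrees[name]
        reach = len(reachable_from(EDGES, name))
        print(f"{name}: in={in_d} out={out_d} reachable={reach}")
```

In a sketch like this, a technique with a high out-degree or a large reachable set is one candidate reading of "critical", since many later techniques depend on its output. Other measures, such as betweenness centrality, would be equally plausible choices.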