Towards a relation extraction framework for cyber-security concepts

Corinne L. Jones, Robert A. Bridges, Kelly Huffer, John Goodall
DOI: https://doi.org/10.48550/arXiv.1504.04317
2015-04-16
Information Retrieval
Abstract: To help security analysts obtain information pertaining to their network, such as novel vulnerabilities, exploits, or patches, information retrieval methods tailored to the security domain are needed. Because labeled text data is scarce and expensive, we follow developments in semi-supervised Natural Language Processing and implement a bootstrapping algorithm for extracting security entities and their relationships from text. The algorithm requires little input data, specifically a few relations or patterns (heuristics for identifying relations), and incorporates an active learning component that queries the user on the most important decisions to prevent drifting from the desired relations. Preliminary testing on a small corpus shows promising results, obtaining a precision of 0.82.
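The bootstrapping loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the regex-based entity matcher, the toy corpus, and the scripted oracle are all hypothetical, and for simplicity every new candidate is shown to the oracle, whereas the abstract says only the most important decisions are queried.

```python
import re
from typing import Callable, Iterable, Set, Tuple

Pair = Tuple[str, str]

# Assumption: an "entity" is a run of capitalized tokens (toy heuristic).
ENTITY = r"([A-Z][\w-]*(?: [A-Z][\w-]*)*)"


def learn_patterns(corpus: Iterable[str], pairs: Set[Pair]) -> Set[str]:
    """Collect the text found between the two entities of each known pair."""
    patterns = set()
    for sentence in corpus:
        for a, b in pairs:
            m = re.search(re.escape(a) + r"(.+?)" + re.escape(b), sentence)
            if m:
                patterns.add(m.group(1))
    return patterns


def apply_patterns(corpus: Iterable[str], patterns: Set[str]) -> Set[Pair]:
    """Find entity pairs that are joined by any known pattern."""
    found = set()
    for sentence in corpus:
        for p in patterns:
            for m in re.finditer(ENTITY + re.escape(p) + ENTITY, sentence):
                found.add((m.group(1), m.group(2)))
    return found


def bootstrap(corpus, seeds: Set[Pair],
              ask_user: Callable[[Pair], bool], rounds: int = 3) -> Set[Pair]:
    """Alternate between learning patterns and extracting new pairs.

    ask_user stands in for the active learning query: it returns True
    to accept a candidate relation, which limits drift from the seeds.
    """
    accepted, rejected = set(seeds), set()
    for _ in range(rounds):
        patterns = learn_patterns(corpus, accepted)
        candidates = apply_patterns(corpus, patterns) - accepted - rejected
        if not candidates:
            break  # nothing new was extracted, so stop early
        for pair in sorted(candidates):
            (accepted if ask_user(pair) else rejected).add(pair)
    return accepted


# Hypothetical toy corpus and seed relation (vendor, product).
CORPUS = [
    "Microsoft released a patch for Windows last week.",
    "Adobe released a patch for Flash Player on Tuesday.",
    "Oracle released a patch for Java yesterday.",
    "Attackers released a patch for Nothing useful.",
]
SEEDS = {("Microsoft", "Windows")}

# Scripted oracle standing in for the analyst's answers.
relations = bootstrap(CORPUS, SEEDS, lambda pair: pair[0] != "Attackers")
```

In this sketch the single seed yields the pattern " released a patch for ", which then proposes three new pairs; the oracle rejects the spurious one, so it never contributes patterns to later rounds.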