A Review on Cyber Security Named Entity Recognition
Chen Gao,Xuan Zhang,Mengting Han,Hui Liu
DOI: https://doi.org/10.1631/fitee.2000286
IF: 2.526
2021-01-01
Frontiers of Information Technology & Electronic Engineering
Abstract:With the rapid development of Internet technology and the advent of the era of big data, more and more cyber security texts are provided on the Internet. These texts include not only security concepts, incidents, tools, guidelines, and policies, but also risk management approaches, best practices, assurances, technologies, and more. Through the integration of large-scale, heterogeneous, unstructured cyber security information, the identification and classification of cyber security entities can help handle cyber security issues. Due to the complexity and diversity of texts in the cyber security domain, it is difficult to identify security entities in the cyber security domain using the traditional named entity recognition (NER) methods. This paper describes various approaches and techniques for NER in this domain, including the rule-based approach, dictionary-based approach, and machine learning based approach, and discusses the problems faced by NER research in this domain, such as conjunction and disjunction, non-standardized naming convention, abbreviation, and massive nesting. Three future directions of NER in cyber security are proposed: (1) application of unsupervised or semi-supervised technology; (2) development of a more comprehensive cyber security ontology; (3) development of a more comprehensive deep learning model.