Abstract: Due to the rapidly increasing number of Internet-connected objects, a huge amount of data is created, stored, and shared. Depending on the use case, this data is cleaned, checked, visualized, and processed for various purposes. However, this data may suffer from problems such as inaccuracy, duplication, and missing values. Such issues can be regarded as anomalies, that is, deviations from a reference point, and can be caused by malicious attackers, abnormal system behavior, or failures of devices, transmission channels, or data processing units. Anomaly detection remains one of the most important issues in cybersecurity, especially for system monitoring, automated forensics, and post-mortem analysis, all of which rely on anomaly detection mechanisms. In the literature, different approaches have been developed to detect anomalies; depending on the algorithms used, they can be classified as statistics-based, semantic-based, clustering-based, classification-based, and deep learning-based. This survey focuses on knowledge-based approaches, a sub-category of semantic-based approaches, as opposed to statistical and learning-based approaches. We provide a detailed comparison of recent work across the knowledge-based subcategories, namely rule-based, score-based, and hybrid approaches. We describe the components of a knowledge-based system and the steps required to process raw data for anomaly detection. Furthermore, for each approach we collect, when available, information about its semantic expressiveness, computational complexity, and application domain. Finally, we identify the challenges and discuss some future research directions in knowledge-based anomaly detection. Identifying such approaches and challenges can help cybersecurity engineers design better models that meet their application requirements.

KAD: a knowledge formalization-based anomaly detection approach for distributed systems

Accurate Anomaly Detection Leveraging Knowledge-enhanced GAT

TKS-BLS: Temporal Kernel Stationary Broad Learning System for Enhanced Modeling, Anomaly Detection, and Incremental Learning with Application to Ironmaking Processes

KG-ADS: A Log Anomaly Detection Assisted Decision-Making with Support of Knowledge Graph and Reinforcement Learning

Remembering Normality: Memory-guided Knowledge Distillation for Unsupervised Anomaly Detection

Pre-trained KPI Anomaly Detection Model Through Disentangled Transformer

KDDT: Knowledge Distillation-Empowered Digital Twin for Anomaly Detection

Distributed system anomaly detection using deep learning-based log analysis

Detection of Cluster Anomalies With ML Techniques

Weakly Supervised Anomaly Detection via Knowledge-Data Alignment

Anomaly detection of unstructured big data via semantic analysis and dynamic knowledge graph construction

DKADE: a novel framework based on deep learning and knowledge graph for identifying adverse drug events and related medications

AutoKAD: Empowering KPI Anomaly Detection with Label-Free Deployment

An Integrated Method for Anomaly Detection From Massive System Logs

A Methodological Report on Anomaly Detection on Dynamic Knowledge Graphs

Online Detection of Anomalies in Temporal Knowledge Graphs with Interpretability

An Empirical Investigation of Practical Log Anomaly Detection for Online Service Systems

ADA: Adaptive Deep Log Anomaly Detector

Knowledge-based anomaly detection: Survey, challenges, and future directions

Data-Driven Root-Cause Analysis For Distributed System Anomalies