A Modular Approach to Automatic Cyber Threat Attribution using Opinion Pools

Koen T.W. Teuwen
DOI: https://doi.org/10.1109/BigData59044.2023.10386708
2024-01-25
Abstract: Cyber threat attribution can play an important role in increasing resilience against digital threats. Recent research focuses on automating the threat attribution process and on integrating it with other efforts, such as threat hunting. To support increasing automation of the cyber threat attribution process, this paper proposes a modular architecture as an alternative to current monolithic automated approaches. The modular architecture can utilize opinion pools to combine the output of concrete attributors. The proposed solution increases the tractability of the threat attribution problem and offers increased usability and interpretability, as opposed to monolithic alternatives. In addition, a Pairing Aggregator is proposed as an aggregation method that forms pairs of attributors based on distinct features to produce intermediary results before finally producing a single Probability Mass Function (PMF) as output. The Pairing Aggregator sequentially applies both the logarithmic opinion pool and the linear opinion pool. An experimental validation suggests that the modular approach does not result in decreased performance and can even enhance precision and recall compared to monolithic alternatives. The results also suggest that the Pairing Aggregator can improve precision over the linear and logarithmic opinion pools. Furthermore, the improved k-accuracy in the experiment suggests that forensic experts can leverage the resulting PMF during their manual attribution processes to enhance their efficiency.
Cryptography and Security, Machine Learning, Software Engineering
What problem does this paper attempt to address?
The core problem this paper addresses is how to increase the degree of automation and the accuracy of cyber threat attribution while improving the interpretability and modularity of the system. Specifically, the author argues that most current threat attribution methods use a monolithic architecture, which is hard to adapt and reuse and is not flexible enough for complex problems. The paper therefore proposes a modular architecture based on opinion pools to overcome these limitations.

### Main contributions of the paper:

1. **Modular architecture**: The threat attribution problem is decomposed into multiple sub-problems, each handled by an independent attributor. These modules can be developed and optimized independently, and their results are then aggregated through an opinion pool.
2. **Application of opinion pools**: The paper introduces the linear opinion pool and the logarithmic opinion pool, and proposes a new Pairing Aggregator. This aggregator is intended to improve the accuracy and robustness of attribution by combining the outputs of different types of attributors.
3. **Performance verification**: Experiments indicate that the modular method does not reduce classification performance and in some cases can improve precision and recall.

### Key formulas:

- **Linear opinion pool**: \[ g_{\text{linear}}[q_1,\ldots,q_K](\theta)=\sum_{k = 1}^{K}w_kq_k(\theta) \]
- **Logarithmic opinion pool**: \[ g_{\text{logarithmic}}[q_1,\ldots,q_K](\theta)=c\prod_{k = 1}^{K}q_k(\theta)^{w_k} \]

Here \(q_k\) is the PMF produced by the \(k\)-th of \(K\) attributors, \(w_k\) is its weight, and \(c\) is a normalization factor.

### Experimental design:

To evaluate the effectiveness of the modular method, the author uses an artificial dataset of 392,577 events with 8 non-stationary features, generated from 128 generated threat actor profiles. The experiment compares the modular method against a single complex XGBoost classifier as a baseline. The results show that the modular method does not reduce performance and improves it in some respects.

### Conclusion:

By proposing a modular architecture and a Pairing Aggregator, this paper addresses the insufficient flexibility and poor interpretability of existing threat attribution methods. The experimental results suggest that the modular method maintains or even improves classification performance while enhancing the interpretability and modularity of the system. This provides new ideas and tools for future research and practical applications.
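To make the two opinion pool formulas concrete, here is a minimal Python sketch, not the author's implementation. It represents each attributor's output as a dictionary from candidate threat actor to probability. The function `pairing_sketch` is only one possible reading of the abstract's description of the Pairing Aggregator: it assumes the logarithmic pool is applied within each pair and the linear pool across pairs, with equal weights and caller-supplied pairs. The function names, the equal weighting, the example actors, and the `top_k` shortlist helper are all illustrative assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

PMF = Dict[str, float]  # candidate threat actor -> probability


def linear_pool(pmfs: Sequence[PMF], weights: Sequence[float]) -> PMF:
    """Linear opinion pool: weighted arithmetic mean of the input PMFs."""
    return {a: sum(w * q[a] for q, w in zip(pmfs, weights)) for a in pmfs[0]}


def logarithmic_pool(pmfs: Sequence[PMF], weights: Sequence[float]) -> PMF:
    """Logarithmic opinion pool: normalized weighted geometric mean."""
    unnorm = {a: math.prod(q[a] ** w for q, w in zip(pmfs, weights))
              for a in pmfs[0]}
    total = sum(unnorm.values())
    if total == 0:
        raise ValueError("every actor was ruled out by at least one attributor")
    # Dividing by the total plays the role of the normalization factor c.
    return {a: v / total for a, v in unnorm.items()}


def pairing_sketch(pmfs: Sequence[PMF], pairs: Sequence[Tuple[int, int]]) -> PMF:
    """ASSUMPTION: log pool within each pair, then linear pool across pairs,
    all with equal weights. How pairs are chosen is left to the caller."""
    intermediate = [logarithmic_pool([pmfs[i], pmfs[j]], [0.5, 0.5])
                    for i, j in pairs]
    n = len(intermediate)
    return linear_pool(intermediate, [1.0 / n] * n)


def top_k(pmf: PMF, k: int) -> List[str]:
    """Hypothetical helper: the k most probable actors, e.g. as an analyst shortlist."""
    return sorted(pmf, key=pmf.get, reverse=True)[:k]


if __name__ == "__main__":
    # Made-up outputs of four attributors over three hypothetical actors.
    q = [
        {"A": 0.6, "B": 0.3, "C": 0.1},
        {"A": 0.2, "B": 0.7, "C": 0.1},
        {"A": 0.5, "B": 0.4, "C": 0.1},
        {"A": 0.3, "B": 0.3, "C": 0.4},
    ]
    print(linear_pool(q[:2], [0.5, 0.5]))
    print(logarithmic_pool(q[:2], [0.5, 0.5]))
    print(top_k(pairing_sketch(q, [(0, 1), (2, 3)]), 2))
```

In this sketch the logarithmic pool assigns zero probability to any actor that a single attributor with positive weight rules out, whereas the linear pool only down-weights that actor. This general difference between the two pools is worth keeping in mind when reading the comparison results.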