MalAF: Malware Attack Foretelling from Run-Time Behavior Graph Sequence

Chen Liu, Bo Li, Jun Zhao, Xudong Liu, Chunpei Li
DOI: https://doi.org/10.1109/tdsc.2023.3298905
2024-01-01
IEEE Transactions on Dependable and Secure Computing
Abstract: Foretelling ongoing malware attacks in real time is challenging because the execution behavior patterns of malware are stealthy and polymorphic. In this paper, we present MalAF, a novel Malware Attack Foretelling framework that uses the run-time behavior of malware (i.e., sequences of API events) to foretell attacks that have not yet been executed. MalAF first samples suspicious API events by assessing the sensitivity of the parameters of each API event, and divides them into multiple attack time slots by calculating strong correlations among them. Following that, MalAF employs dynamic heterogeneous graph sequences to incrementally model the contextual semantics of each attack time slot, generating malware state sequences in real time. Moreover, MalAF proposes a greedy adaptive dictionary (GAD)-optimized inverse reinforcement learning (IRL) preference learning method to automatically capture each family's intrinsic attack preferences, which achieves higher performance than existing IRL methods. Additionally, guided by the families' attack preferences, MalAF trains an LSTM to foretell the future path of the target malware. Finally, MalAF matches the identified API paths against a malicious capability base and reports the attacks to an analyst in comprehensible form. Experiments on real-world datasets demonstrate that MalAF outperforms state-of-the-art methods, improving on the baselines by 3.01% to 4.73% in path-foretelling accuracy.