General, Efficient, and Real-Time Data Compaction Strategy for APT Forensic Analysis
Tiantian Zhu, Jiayu Wang, Linqi Ruan, Chunlin Xiong, Jinkai Yu, Yaosheng Li, Yan Chen, Mingqi Lv, Tieming Chen
DOI: https://doi.org/10.1109/tifs.2021.3076288
IF: 7.231
2021-01-01
IEEE Transactions on Information Forensics and Security
Abstract: The damage that Advanced Persistent Threat (APT) attacks inflict on governments and large enterprises continues to escalate. Once an attack is detected, forensic analysis uses the dependencies between system audit logs to rapidly locate intrusion points and determine the impact of the attack. Because APT attacks are highly persistent, huge amounts of data must be stored to support forensic analysis, which imposes a large storage overhead and sharply increases computing costs. Several methods have been proposed to compact data without affecting forensic analysis. In real-world scenarios, however, these methods suffer from weak cross-platform capability, large data processing overhead, and poor real-time performance, so they struggle to meet usability and universality requirements jointly. To overcome these difficulties, this paper proposes a general, efficient, and real-time data compaction method at the system log level. It does not involve internal analysis of programs or depend on a specific operating system type, and it includes two strategies: 1) data compaction that maintains global semantics (GS), which identifies and deletes redundant events that do not affect global dependencies, and 2) data compaction based on suspicious semantics (SS). Given that the purpose of forensic analysis is to restore the attack chain, SS performs context analysis on the events remaining after GS and further deletes the parts that are not related to the attack. Real-world experiments show that the method's compaction ratios on system events reach $4.36\times$ to $13.18\times$ with GS and $7.86\times$ to $26.99\times$ with SS, which is better than state-of-the-art studies.
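To make the idea of deleting redundant events without affecting dependencies more concrete, the following is a minimal sketch of a toy redundancy rule. It is not the paper's GS or SS algorithm, which the abstract does not specify. It assumes a hypothetical `Event` record with a timestamp, a source entity, and a destination entity, and it drops a repeated source-to-destination event when the source has received no new input since the last kept event of that pair. The `compaction_ratio` helper assumes the reported ratios mean original event count divided by remaining event count, which is an interpretation rather than something the abstract states.

```python
from typing import NamedTuple


class Event(NamedTuple):
    """Hypothetical audit event: information flows from src to dst at time ts."""
    ts: int
    src: str
    dst: str


def compact(events):
    """Toy rule (an assumption, not the paper's GS strategy): drop a repeated
    src->dst event if src received no new input since the last kept one."""
    last_in = {}    # entity -> timestamp of its most recent kept incoming event
    last_kept = {}  # (src, dst) -> timestamp of the most recent kept event
    kept = []
    for ev in sorted(events, key=lambda e: e.ts):
        key = (ev.src, ev.dst)
        prev = last_kept.get(key)
        # Keep the event if the pair is new, or if src changed after prev.
        if prev is None or last_in.get(ev.src, float("-inf")) > prev:
            kept.append(ev)
            last_kept[key] = ev.ts
            # Only kept events count as new input to dst; a dropped event
            # carries nothing that dst has not already received.
            last_in[ev.dst] = ev.ts
    return kept


def compaction_ratio(before, after):
    """Assumed definition: original event count over remaining event count."""
    return before / after


events = [
    Event(1, "bash", "file_a"),
    Event(2, "bash", "file_a"),   # repeat, bash unchanged: dropped
    Event(3, "socket", "bash"),   # new input to bash
    Event(4, "bash", "file_a"),   # bash changed since ts=1: kept
    Event(5, "bash", "file_a"),   # repeat, bash unchanged: dropped
]
kept = compact(events)
print([e.ts for e in kept])
print(round(compaction_ratio(len(events), len(kept)), 2))
```

The sketch processes events in timestamp order with constant work per event, which mirrors the streaming, log-level setting the abstract describes. A real system would also have to account for forward tracking and entity lifetimes, which this toy rule ignores.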