An APT Malware Classification Method Based on Adaboost Feature Selection and LightGBM

Na Xu, Shudong Li, Xiaobo Wu, Weihong Han, Xiaojing Luo
DOI: https://doi.org/10.1109/dsc53577.2021.00101
2021-10-01
Abstract: Advanced Persistent Threat (APT) attacks themed around COVID-19 and vaccines are growing rapidly. The targets of APT attacks have gradually expanded from government agencies to vaccine manufacturers, the medical industry, and other sectors. Moreover, APT groups have a strict organizational structure and a professional division of labor, and malware delivered by the same APT group tends to be similar. Classifying malware samples into known APT groups promptly can minimize losses and keep the relevant industries vigilant. In this paper, we propose a multi-class classification method for APT malware based on Adaboost and LightGBM. We collect real APT malware samples delivered by 12 known APT groups and obtain the API call sequence of each sample by running it in a sandbox. To capture the relationship between adjacent APIs, we use the TF-IDF algorithm combined with bi-grams. The Adaboost algorithm is then used to select the important API features, which form the target feature subset. Finally, we use this subset together with the LightGBM ensemble algorithm to train multiple classifiers, named Ada-LightGBM. The experimental results show that our method outperforms Adaboost or LightGBM used alone, and that the classifier has good recognition performance on the test samples.
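The feature-extraction step described in the abstract (TF-IDF over bi-grams of adjacent API calls) can be illustrated with a minimal sketch. This is not the authors' implementation: the API traces, the function names `bigrams` and `tfidf_bigrams`, and the unsmoothed textbook weighting (term frequency times log(N / document frequency)) are all assumptions made for illustration. The later stages (Adaboost feature selection and LightGBM training) are not shown.

```python
import math
from collections import Counter


def bigrams(calls):
    """Return adjacent API pairs, e.g. ['A', 'B', 'C'] -> ['A B', 'B C']."""
    return [f"{a} {b}" for a, b in zip(calls, calls[1:])]


def tfidf_bigrams(sequences):
    """Compute TF-IDF weights of API bi-grams for each sample.

    Assumed weighting (textbook variant, not taken from the paper):
      tf  = bi-gram count / total bi-grams in the sample
      idf = log(number of samples / samples containing the bi-gram)
    """
    docs = [Counter(bigrams(seq)) for seq in sequences]
    n_docs = len(docs)

    # Document frequency: in how many samples each bi-gram occurs.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())

    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in doc.items()
        })
    return vectors


# Hypothetical API call traces, standing in for sandbox output.
samples = [
    ["CreateFileW", "WriteFile", "CloseHandle"],
    ["CreateFileW", "WriteFile", "RegSetValueExW"],
    ["InternetOpenA", "InternetConnectA", "HttpSendRequestA"],
]
weights = tfidf_bigrams(samples)
```

In this sketch, a bi-gram shared by several samples (such as `CreateFileW WriteFile`) receives a lower weight than one unique to a single sample, which is the property that makes the weighted bi-grams useful as input to a feature-selection step.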