De-obfuscation and Detection of Malicious PDF Files with High Accuracy

Xun Lu,Jianwei Zhuge,Ruoyu Wang,Yinzhi Cao,Yan Chen
DOI: https://doi.org/10.1109/hicss.2013.166
2013-01-01
Abstract:Due to its high popularity and rich functionalities, the Portable Document Format (PDF) has become a major vector for malware propagation. To detect malicious PDF files, the first step is to extract and de-obfuscate Java Script codes from the document, for which an effective technique is yet to be created. However, existing static methods cannot de-obfuscate Java Script codes, existing dynamic methods bring high overhead, and existing hybrid methods introduce high false negatives. Therefore, in this paper, we present MPScan, a scanner that combines dynamic Java Script de-obfuscation and static malware detection. By hooking the Adobe Reader's native Java Script engine, Java Script source code and op-code can be extracted on the fly after the source code is parsed and then executed. We also perform a multilevel analysis on the resulting Java Script strings and op-code to detect malware. Our evaluation shows that regardless of obfuscation techniques, MPScan can effectively de-obfuscate and detect 98% malicious PDF samples.
What problem does this paper attempt to address?