File Fragment Classification Using Grayscale Image Conversion and Deep Learning in Digital Forensics

Qian Chen, Qing Liao, Zoe L. Jiang, Junbin Fang, Siuming Yiu, Guikai Xi, Rong Li, Zhengzhong Yi, Xuan Wang, Lucas C. K. Hui, Dong Liu, En Zhang
DOI: https://doi.org/10.1109/spw.2018.00029
2018-01-01
Abstract: File fragment classification is an important step in digital forensics. The most popular methods rely on traditional machine learning with hand-extracted features such as N-grams, Shannon entropy, or Hamming weights. However, these features are far from sufficient for classifying file fragments. In this paper, we propose a novel scheme based on fragment-to-grayscale image conversion and deep learning to extract more hidden features and thereby improve classification accuracy. Benefiting from multi-layered feature maps, our deep convolutional neural network (CNN) model can extract nearly ten thousand features through the non-linear connections between neurons. Our proposed CNN model was trained and tested on the public dataset GovDocs. The experimental results show that we can achieve 70.9% classification accuracy, which is higher than that of existing work.
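The fragment-to-grayscale conversion mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the fragment size, image dimensions, or padding rule, so the 64-pixel row width and zero padding below are assumptions made only for illustration, and the function names are hypothetical rather than taken from the paper. Each byte value (0 to 255) is treated as one grayscale pixel; a byte-level Shannon entropy function is included as an example of the hand-crafted features the abstract contrasts against.

```python
import math
from collections import Counter


def fragment_to_grayscale(fragment: bytes, width: int = 64) -> list[list[int]]:
    """Map each byte (0-255) to one grayscale pixel, filling rows left to right.

    The row width and the zero padding are assumptions of this sketch,
    not values taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Zero-pad the tail so that the last row is complete.
    padded_len = math.ceil(len(fragment) / width) * width
    padded = fragment.ljust(padded_len, b"\x00")
    # Iterating over a bytes slice yields integers in the range 0-255.
    return [list(padded[i:i + width]) for i in range(0, padded_len, width)]


def shannon_entropy(fragment: bytes) -> float:
    """Byte-level Shannon entropy in bits per byte (a hand-crafted feature)."""
    if not fragment:
        return 0.0
    total = len(fragment)
    counts = Counter(fragment)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # A synthetic 4096-byte fragment (an assumed size) becomes a 64x64 image.
    fragment = bytes(range(256)) * 16
    image = fragment_to_grayscale(fragment, width=64)
    print(len(image), len(image[0]))
    print(shannon_entropy(fragment))
```

The resulting two-dimensional pixel grid is the kind of input a CNN could consume; the network architecture itself is not described in the abstract and is therefore not sketched here.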