NLP-based Digital Forensic Investigation Platform for Online Communications

Dongming Sun, Xiaolu Zhang, Kim-Kwang Raymond Choo, Liang Hu, Feng Wang
DOI: https://doi.org/10.1016/j.cose.2021.102210
IF: 5.105
2021-01-01
Computers & Security
Abstract: Digital (forensic) investigations will be increasingly important in both criminal investigations and civil litigation (e.g., corporate espionage and intellectual property theft) as more of our communications take place in cyberspace (e.g., e-mail and social media platforms). In this paper, we present our proposed Natural Language Processing (NLP)-based digital investigation platform. The platform comprises four phases: data collection and representation, vectorization, feature selection, and classifier generation and evaluation. We then demonstrate the potential of our proposed approach using a real-world dataset. The findings indicate that it outperforms two competing approaches, namely LogAnalysis (published in Expert Systems with Applications, 2014) and SIIMCO (published in IEEE Transactions on Information Forensics and Security, 2016). Specifically, our proposed approach achieves an F1-score of 0.65 and a precision of 0.83, whilst LogAnalysis and SIIMCO achieve F1-scores of 0.51 and 0.59 and precisions of 0.49 and 0.58, respectively. © 2021 Elsevier Ltd. All rights reserved.
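The four phases named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. The abstract does not say which vectorizer, feature selector, or classifier the authors use, so everything here is an assumption made for illustration: bag-of-words counts for vectorization, a document-frequency-difference score for feature selection, and multinomial naive Bayes as the classifier. The messages and labels are invented toy data, not the paper's dataset, and all function names are hypothetical.

```python
import math
from collections import Counter

def tokenize(text):
    # Data representation (assumed): lowercase whitespace tokens.
    return text.lower().split()

def select_features(docs, labels, k):
    # Feature selection (assumed stand-in): keep the k tokens whose
    # document frequency differs most between the two classes.
    pos, neg = Counter(), Counter()
    for doc, y in zip(docs, labels):
        (pos if y == 1 else neg).update(set(tokenize(doc)))
    n_pos = max(1, labels.count(1))
    n_neg = max(1, labels.count(0))
    score = {t: abs(pos[t] / n_pos - neg[t] / n_neg) for t in set(pos) | set(neg)}
    # Ties are broken alphabetically so the result is deterministic.
    return set(sorted(score, key=lambda t: (-score[t], t))[:k])

def vectorize(docs, vocab):
    # Vectorization (assumed): bag-of-words counts restricted to vocab.
    return [Counter(t for t in tokenize(d) if t in vocab) for d in docs]

def train_nb(vectors, labels, vocab):
    # Classifier generation (assumed): multinomial naive Bayes, add-one smoothing.
    model = {}
    for c in (0, 1):
        counts = Counter()
        for vec, y in zip(vectors, labels):
            if y == c:
                counts.update(vec)
        total = sum(counts.values()) + len(vocab)
        prior = math.log(labels.count(c) / len(labels))
        model[c] = (prior, {t: math.log((counts[t] + 1) / total) for t in vocab})
    return model

def predict(model, vector):
    def score(c):
        prior, loglik = model[c]
        return prior + sum(n * loglik[t] for t, n in vector.items())
    return max((0, 1), key=score)

def precision_f1(y_true, y_pred):
    # Evaluation: precision and F1 for the positive class (label 1).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, f1

# Invented toy messages: 1 = relevant to an investigation, 0 = not relevant.
train_docs = [
    "wire the funds offshore tonight",
    "delete the ledger before audit",
    "move funds to the shell account",
    "lunch at noon tomorrow",
    "team meeting moved to friday",
    "happy birthday see you tomorrow",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_docs = ["hide the funds before audit", "see you at lunch friday"]
test_labels = [1, 0]

vocab = select_features(train_docs, train_labels, k=8)
model = train_nb(vectorize(train_docs, vocab), train_labels, vocab)
y_pred = [predict(model, v) for v in vectorize(test_docs, vocab)]
precision, f1 = precision_f1(test_labels, y_pred)
print(y_pred, precision, f1)
```

Reporting precision alongside F1, as the abstract does, separates how often flagged messages are truly relevant from the overall balance between precision and recall. On a dataset this small the scores say nothing about real performance; the sketch only shows how the phases connect.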