Phishing Email Detection Using Natural Language Processing Techniques: A Literature Survey

Said Salloum, Tarek Gaber, Sunil Vadera, Khaled Shaalan
DOI: https://doi.org/10.1016/j.procs.2021.05.077
2021-01-01
Procedia Computer Science
Abstract: Phishing is the most prevalent form of cybercrime that convinces people to provide sensitive information such as account IDs, passwords, and bank details. Emails, instant messages, and phone calls are widely used to launch such attacks. Although the methods for preventing these attacks are constantly updated, the results remain inadequate. Meanwhile, phishing emails have increased exponentially in recent years, which suggests a need for more effective and advanced methods to counter them. Numerous methods have been developed to filter phishing emails, but the problem still lacks a complete solution. To the best of our knowledge, this is the first survey that focuses on using Natural Language Processing (NLP) and Machine Learning (ML) techniques to detect phishing emails. This study analyzes the state-of-the-art NLP strategies currently used to identify phishing emails at various stages of the attack, with an emphasis on ML strategies. These approaches are subjected to a comparative assessment and analysis, which gives a sense of the problem, its immediate solution space, and the expected future research directions.