A Collaborative Learning Technique for Improved Email Security
Yaser Ali Shah, Nimra Waqar, Um-e-Aimen, Amaad Khalil, Muhammad Bilal Rafaqat, Abid Iqbal
DOI: https://doi.org/10.21015/vtse.v12i2.1807
2024-06-30
VFAST Transactions on Software Engineering
Abstract: Email is now ubiquitous, and reliably distinguishing genuine messages from spam remains a persistent challenge that calls for more sophisticated approaches. This study evaluates an ensemble of Random Forest and Naive Bayes, combined through a voting classifier, on the email classification problem. Alongside the machine learning models, the research applies key preprocessing steps, including feature selection and data integrity checks, to ensure the validity of the analysis on real email data. The collaborative learning model (a hybrid of Random Forest and Naive Bayes) is trained and evaluated on key performance indicators, including accuracy and classification reports. Robust techniques are used to address common problems with email data, such as missing values. In particular, the Collaborative Voting Classifier proves effective at improving overall model performance by providing a balanced means of email classification. The results examine recall, accuracy, and precision in detail and are illustrated with confusion matrices. We compare several classification algorithms on the same dataset: the proposed Voting Classifier, K-Nearest Neighbors, Gaussian Naive Bayes, and Random Forest. The proposed Voting Classifier performs well overall, with precision of 99%, recall of 96%, an F1-score of 95%, and accuracy of 95.9%. The study offers a practical perspective for real-world classification tasks and shows the relative advantages and disadvantages of the different methods.
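The ensemble described in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation. The toy messages, the TF-IDF features, the soft-voting setting, and all hyperparameters are assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data (not the paper's dataset): 1 = spam, 0 = genuine.
raw = [
    ("win a free prize now", 1),
    ("claim your free cash reward today", 1),
    ("cheap loans, click here to win", 1),
    ("urgent offer: free gift card inside", 1),
    (None, 1),  # a missing message, as can occur in real email data
    ("meeting moved to monday morning", 0),
    ("please review the attached project report", 0),
    ("lunch tomorrow with the team?", 0),
    ("notes from the budget meeting attached", 0),
]

# Data integrity check: drop rows whose text is missing or blank.
clean = [(text, label) for text, label in raw
         if isinstance(text, str) and text.strip()]
train_texts = [text for text, _ in clean]
y_train = np.array([label for _, label in clean])

test_texts = ["free prize, click to claim", "project meeting on monday"]
y_test = np.array([1, 0])

# TF-IDF features (an assumption); GaussianNB needs dense arrays.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts).toarray()
X_test = vectorizer.transform(test_texts).toarray()

# Random Forest + Gaussian Naive Bayes combined by a voting classifier.
# voting="soft" averages predicted probabilities; this choice is assumed.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)
pred = ensemble.predict(X_test)

# Precision, recall, and F1 per class, plus the confusion matrix.
print(classification_report(y_test, pred, zero_division=0))
cm = confusion_matrix(y_test, pred, labels=[0, 1])
print(cm)
```

With only two base models, hard (majority) voting breaks ties by class order, which is one reason soft voting is used in this sketch. On a dataset of realistic size, the evaluation would use a proper held-out split and not two hand-written test messages.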