Auditing of hadoop log file for dynamic detection of threats using H-ISSM-MIM and convolutional neural network

S. Suganya,S. Selvamuthukumaran
DOI: https://doi.org/10.3233/jifs-233579
2023-08-05
Abstract:Hadoop is a big data processing system that enables the distributed processing of massive data sets across multiple computers using straightforward programming techniques. Hadoop has been extensively investigated in many attacks as a result of its growing significance in industry. A company may learn about the actions of invaders as well as the weaknesses of the Hadoop cluster by examining a significant quantity of data from the log file. In a Big Data setting, the goal of the paper is to generate an analytical classification for intrusion detection. In this study, Hadoop log files were examined based on assaults that were recorded in the log files. Prior to analysis, the log data is cleaned and improved using a Hadoop preprocessing tool. For feature extraction, the hybrid Improved Sparrow Search Algorithm with Mutual Information Maximization (H-ISSA-MIM). Then the CNN (Convolutional Neural Network) classifier will detect the intrusions. The implementation is performed using the MATLAB 2020a software. The performance metrics like accuracy, precision, F-score, recall, specificity, FPR, FNR are calculated for the proposed methodology and it is compared with the existing techniques like Decision Tree (DT), Principal Components Analysis (PCA)- K means, Long Short Time Memory (LSTM). The maximum value of accuracy finds out in the proposed method 98% .
What problem does this paper attempt to address?