A Novel Control-flow based Intrusion Detection Technique for Big Data Systems

Santosh Aditham, Nagarajan Ranganathan
DOI: https://doi.org/10.48550/arXiv.1611.07649
2016-11-23
Abstract: Security and distributed infrastructure are two of the most common requirements for big data software, but the security features of big data platforms are still immature. It is critical to identify, modify, test, and execute some of the existing security mechanisms before using them in the big data world. In this paper, we propose a novel intrusion detection technique that understands and works according to the needs of big data systems. Our proposed technique identifies program-level anomalies using two methods: a profiling method that models application behavior by creating process signatures from control-flow graphs, and a matching method that checks for coherence among the replica nodes of a big data system by matching the process signatures. The profiling method creates a process signature by reducing the control-flow graph of a process to a set of minimum spanning trees and then hashing that set. The matching method first checks for similarity in process behavior by matching the received process signature with the local signature, and then shares the result with all replica datanodes for consensus. Experimental results show only 0.8% overhead due to the proposed technique when tested on the Hadoop MapReduce examples in real time.
Cryptography and Security; Distributed, Parallel, and Cluster Computing
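
The profiling and matching steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that the abstract does not state: the control-flow graph is treated as an undirected weighted graph, the set of minimum spanning trees is computed as a minimum spanning forest with Kruskal's algorithm, the hash is SHA-256, and consensus is a strict majority vote. All function names (`minimum_spanning_forest`, `process_signature`, `replica_consensus`) and the example graph are hypothetical.

```python
import hashlib


def minimum_spanning_forest(nodes, edges):
    """Kruskal's algorithm over (weight, u, v) edges, treated as undirected (assumption)."""
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Normalize endpoint order so every replica sorts the edges identically.
    normalized = sorted((w, min(u, v), max(u, v)) for w, u, v in edges)
    forest = []
    for w, u, v in normalized:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            forest.append((w, u, v))
    return forest


def process_signature(nodes, edges):
    """Hash a canonical encoding of the spanning forest (SHA-256 is an assumption)."""
    forest = minimum_spanning_forest(nodes, edges)
    canonical = ";".join(f"{w}:{u}-{v}" for w, u, v in forest)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def replica_consensus(votes):
    """Return True when a strict majority of replicas report a signature match."""
    return 2 * sum(votes) > len(votes)


# Hypothetical control-flow graph of a small loop.
nodes = ["entry", "loop", "body", "exit"]
edges = [(1, "entry", "loop"), (2, "loop", "body"),
         (3, "body", "loop"), (1, "loop", "exit")]

local_sig = process_signature(nodes, edges)

# A replica whose process gained an extra basic block produces a different signature.
tampered_sig = process_signature(nodes + ["inject"], edges + [(1, "body", "inject")])

# Each replica compares the received signature with its own and shares the result.
votes = [local_sig == local_sig, local_sig == local_sig, local_sig == tampered_sig]
attack_free = replica_consensus(votes)
```

In this sketch each replica only exchanges a fixed-length digest and a Boolean result rather than the graph itself, which is one plausible reason a signature-based comparison can stay cheap.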