An Improved Network Traffic Classification Algorithm Based on Hadoop Decision Tree

Zhengwu Yuan, Chaozheng Wang
DOI: https://doi.org/10.1109/icoacs.2016.7563047
2016-01-01
Abstract: In the current Internet era, network traffic is growing exponentially. Whether the goal is to meet user demand for network resources, to support QoS scheduling, or to expand and upgrade existing networks in line with the development trend of network applications, the various applications in network traffic must be classified and identified accurately, which makes network traffic classification particularly important. The C4.5 decision tree algorithm, a commonly used supervised classification algorithm, is often applied to traffic classification, but its efficiency declines as data volume increases. Hadoop, an open-source cloud framework, performs well on big data and is therefore often the preferred platform for processing large data sets. Building on the original C4.5 algorithm, this paper simplifies the algorithm and parallelizes it on the Hadoop platform; the result is referred to as the HAC4.5 decision tree algorithm. Experiments show that the improved HAC4.5 decision tree algorithm improves not only running speed but also accuracy.
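The abstract does not describe how HAC4.5 simplifies or parallelizes C4.5, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes the standard C4.5 split criterion (gain ratio) and imitates a MapReduce job in plain Python: a map step emits one count per (attribute value, class label) pair, a reduce step sums those counts, and the gain ratio is then computed from the aggregated counts alone. The function names, the two-feature flow records, and the class labels are all hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (base 2) of a collection of non-negative counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def map_phase(records, attr_index):
    """Map step: emit ((attribute value, class label), 1) for each record."""
    for features, label in records:
        yield (features[attr_index], label), 1


def reduce_phase(pairs):
    """Reduce step: sum the emitted counts per (attribute value, class label)."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts


def gain_ratio(counts):
    """Standard C4.5 gain ratio computed from the aggregated counts only."""
    class_totals = Counter()
    value_totals = Counter()
    for (value, label), n in counts.items():
        class_totals[label] += n
        value_totals[value] += n
    total = sum(value_totals.values())

    # Entropy of the class labels before splitting.
    base = entropy(class_totals.values())

    # Weighted entropy of the class labels after splitting on the attribute.
    conditional = 0.0
    for value, v_total in value_totals.items():
        subset = [n for (v, _), n in counts.items() if v == value]
        conditional += (v_total / total) * entropy(subset)

    gain = base - conditional
    split_info = entropy(value_totals.values())
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical flow records: (protocol, port range) -> application class.
records = [
    (("tcp", "low"), "web"),
    (("tcp", "high"), "web"),
    (("udp", "low"), "p2p"),
    (("udp", "high"), "p2p"),
]

for attr_index, name in enumerate(["protocol", "port range"]):
    counts = reduce_phase(map_phase(records, attr_index))
    print(name, gain_ratio(counts))
```

Because the criterion depends only on the summed counts, the counting work can in principle be spread across many mappers, which is one plausible reason a Hadoop port of C4.5 could scale better than a single-machine run. Whether HAC4.5 is organized this way is not stated in the abstract.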