Investigation of network traffic in geo-distributed data centers

Yutaka Koshiba, Wuhui Chen, Yuichi Yamada, Takazumi Tanaka, Incheon Paik
DOI: https://doi.org/10.1109/ICAwST.2015.7314042
2015-01-01
Abstract: Understanding the characteristics of network traffic in a Hadoop cluster is key to identifying existing problems and improving the performance of MapReduce operations. However, existing work focuses on analyzing MapReduce performance within a single data center, and network traffic in a geo-distributed data center environment has not yet been well studied. In this paper, we study the network traffic characteristics of geo-distributed data centers. We first construct a geo-distributed data center environment by adding latency among data center clusters with 18 data nodes. We then collect traffic log data by running MapReduce applications in this environment. Finally, by analyzing the log data, we identify several interesting results that will inform our future research.