A New MapReduce Framework Based on Virtual IP Mechanism and Load Balancing Strategy

Song Yang, Hao Pingting, Hu Jiejun, Hu Liang, Che Xilong
DOI: https://doi.org/10.2174/1874110x01509010253
2015-01-01
Abstract: MapReduce is an important method for large-scale data processing on parallel architectures. In the Hadoop ecosystem, MapReduce runs at the application level, which gives the system flexibility. MapReduce is well suited to offline batch processing and can shorten overall execution time. However, the MapReduce architecture lacks load balancing and scalability, which leads to low efficiency when dealing with large-scale data. In this paper, we propose a new MapReduce framework that is better suited to the Hadoop ecosystem. The framework is based on a virtual IP mechanism and a load balancing strategy. Comparative experiments indicate that the new framework achieves twice the performance of the original MapReduce. In addition, the framework fits the Hadoop ecosystem environment and provides stable and efficient data processing.