Swarm Intelligence with a Chaotic Leader and a Salp algorithm: HDFS optimization for reduced latency and enhanced availability

N. Jagadish Kumar, D. Dhinakaran, A. Naresh Kumar, A. V. Kalpana
DOI: https://doi.org/10.1002/cpe.8127
2024-04-19
Concurrency and Computation: Practice and Experience
Abstract: The Hadoop Distributed File System (HDFS) manages data by segmenting it into blocks distributed across the DataNodes in its cluster. The default block size is 64 MB in Hadoop 1.x and 128 MB in Hadoop 2.x, and it can be customized for larger files. HDFS ensures data reliability by replicating blocks across multiple DataNodes, but replication can introduce high latency in cloud storage during heavy network traffic, particularly in big data processing. To address this, we introduce swarm intelligence with a chaotic leader and a salp (SI‐CLS) optimization algorithm, which reduces network traffic between racks in HDFS by optimizing block distribution. The SI‐CLS algorithm calculates a fitness value for each block, aiming to increase data availability and reduce latency. Performance metrics, including latency, data availability, and load balancing, indicate the effectiveness of SI‐CLS. Adopting the algorithm improves HDFS performance, providing better data availability and lower latency and, in turn, improved system reliability.
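The abstract does not give the SI‐CLS update rules or its per-block fitness function, so the following is only a minimal sketch of the general idea: a standard salp swarm loop in which the leader's random coefficient is replaced by a logistic chaotic map. The function names, the logistic map, the parameter values, and the toy `sphere` objective are all assumptions made for illustration and are not taken from the paper; a real block-placement fitness would score latency and availability instead.

```python
import math
import random


def logistic_map(x):
    """One step of the logistic map with r = 4 (chaotic regime)."""
    return 4.0 * x * (1.0 - x)


def sphere(position):
    """Toy objective (assumption): stands in for a real block-placement fitness."""
    return sum(v * v for v in position)


def chaotic_salp_swarm(fitness, dim, lb, ub, n_salps=20, n_iters=100, seed=0):
    """Minimize `fitness` over [lb, ub]^dim with a chaotic-leader salp swarm sketch."""
    rng = random.Random(seed)
    salps = [[rng.uniform(lb, ub) for _ in range(dim)] for _ in range(n_salps)]
    best = min(salps, key=fitness)[:]
    best_fit = fitness(best)
    chaos = 0.7  # hypothetical seed for the chaotic sequence

    for it in range(1, n_iters + 1):
        # c1 decays over time, shifting the swarm from exploration to exploitation.
        c1 = 2.0 * math.exp(-((4.0 * it / n_iters) ** 2))
        for i in range(n_salps):
            for j in range(dim):
                if i == 0:
                    # Leader: moves around the best-known position, with the
                    # step size driven by the chaotic sequence.
                    chaos = logistic_map(chaos)
                    if not 0.0 < chaos < 1.0:
                        chaos = 0.7  # guard against a degenerate orbit
                    step = c1 * ((ub - lb) * chaos + lb)
                    if rng.random() >= 0.5:
                        salps[i][j] = best[j] + step
                    else:
                        salps[i][j] = best[j] - step
                else:
                    # Follower: moves halfway toward the salp ahead of it in the chain.
                    salps[i][j] = 0.5 * (salps[i][j] + salps[i - 1][j])
                # Keep every coordinate inside the search bounds.
                salps[i][j] = min(ub, max(lb, salps[i][j]))

        # Track the best position found so far.
        for s in salps:
            f = fitness(s)
            if f < best_fit:
                best, best_fit = s[:], f

    return best, best_fit


best, best_fit = chaotic_salp_swarm(sphere, dim=3, lb=-5.0, ub=5.0)
print(best, best_fit)
```

In a placement setting, each position would presumably encode a candidate assignment of blocks to DataNodes or racks, and the objective would penalize cross-rack traffic; that encoding is left out here because the abstract does not specify it.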
computer science, theory & methods, software engineering