Parallel and distributed architecture of genetic algorithm on Apache Hadoop and Spark

Hao-Chun Lu, F.J. Hwang, Yao-Huei Huang

DOI: https://doi.org/10.1016/j.asoc.2020.106497

IF: 8.7

2020-10-01

Applied Soft Computing

Abstract: The genetic algorithm (GA), one of the best-known metaheuristic algorithms, has been extensively utilized in various fields of management science, operational research, and industrial engineering. The efficiency of GAs in solving large-scale optimization problems would be enhanced if the iterative processes required by the genetic operators can be implemented in a parallel and distributed computing architecture. Apache Hadoop has recently been one of the most popular systems for distributed storage and parallel processing of big data. By integrating the GA highly into Apache Hadoop, this study proposes an advanced GA parallel and distributed computing architecture that achieves the effectiveness and efficiency of GA evolution. Characterized by the sophisticated mechanism of dispatching the GA core operators into Apache Hadoop, the developed computing framework fits well with the cloud computing model. The presented GA parallelization architecture outperforms the state-of-the-art reference architectures according to the computational experiments where the testing instances of traveling salesman problems are employed. Our numerical experiments also demonstrate that the proposed architecture can readily be extended to Apache Spark.

computer science, artificial intelligence, interdisciplinary applications

What problem does this paper attempt to address?

The paper addresses the efficiency of genetic algorithms (GAs) on large-scale optimization problems by running them on parallel and distributed computing architectures. Specifically, its goals are:

1. **Developing Parallel and Distributed Genetic Algorithm Architectures**: The researchers propose a parallel and distributed computing architecture for GAs based on Apache Hadoop, aiming to make GAs more efficient on large-scale optimization problems.
2. **Integrating Genetic Algorithms with Apache Hadoop**: By dispatching the core GA operators into the Apache Hadoop framework, the study designs an architecture intended to make GA evolution both effective and efficient.
3. **Addressing Issues in Existing Solutions**: The paper analyzes three existing benchmark models (master-slave, distributed, and cellular) and their implementations on MapReduce, pointing out shortcomings such as premature convergence and low computational efficiency.
4. **Proposing a New Parallel Mechanism**: To overcome these drawbacks, the research proposes a parallel mechanism that executes the evaluation, crossover, and mutation operations in a parallel and distributed environment while avoiding issues such as premature convergence.
5. **Validating the Effectiveness of the Proposed Architecture**: In computational experiments on traveling salesman problem (TSP) instances, the proposed architecture outperforms the existing reference architectures, and the experiments also show that it can readily be extended to Apache Spark.

In summary, the paper designs a new parallel and distributed computing architecture for GAs so that large-scale optimization problems can be solved more efficiently than with traditional GA implementations.
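To make the idea of running evaluation, crossover, and mutation as parallel tasks more concrete, here is a minimal single-machine sketch in Python. It is not the authors' Hadoop or Spark architecture: the function names (`evolve`, `breed`, `tour_length`), the operator choices (order crossover, swap mutation, size-3 tournament selection, one elite), and the `pmap` parameter are all assumptions made for illustration. `pmap` is any order-preserving map function; a thread pool stands in for cluster workers here, and in CPython threads will not actually speed up this CPU-bound work.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def tour_length(tour, cities):
    """Total length of a closed tour over (x, y) city coordinates."""
    return sum(
        math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def order_crossover(p1, p2, rng):
    """Order crossover (OX): copy a slice of p1, fill the rest in p2's order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    kept = set(child[a:b + 1])
    fill = [c for c in p2 if c not in kept]
    for i in range(n):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child


def swap_mutation(tour, rate, rng):
    """With probability `rate`, swap two cities in a copy of the tour."""
    tour = tour[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(tour)), 2)
        tour[i], tour[j] = tour[j], tour[i]
    return tour


def breed(task):
    """Worker task: selection, crossover, and mutation for one child."""
    scored, rate, seed = task
    rng = random.Random(seed)  # per-task seed keeps results reproducible

    def pick():
        # Tournament of size 3 over (length, tour) pairs; shortest tour wins.
        return min(rng.sample(scored, 3))[1]

    return swap_mutation(order_crossover(pick(), pick(), rng), rate, rng)


def evolve(cities, pop_size=40, generations=60, rate=0.2, seed=0, pmap=map):
    """Run the GA; `pmap` decides where evaluation and breeding tasks run."""
    rng = random.Random(seed)
    n = len(cities)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        # Map step 1: evaluate every individual independently.
        lengths = list(pmap(lambda t: tour_length(t, cities), population))
        scored = sorted(zip(lengths, population))
        if best is None or scored[0][0] < best[0]:
            best = scored[0]
        # Map step 2: each task produces one child; the elite is kept as-is.
        tasks = [(scored, rate, rng.getrandbits(32)) for _ in range(pop_size - 1)]
        population = [scored[0][1]] + list(pmap(breed, tasks))
    return best  # (length, tour) of the best individual evaluated


if __name__ == "__main__":
    demo_rng = random.Random(42)
    demo_cities = [(demo_rng.random(), demo_rng.random()) for _ in range(12)]
    with ThreadPoolExecutor(max_workers=4) as pool:
        length, tour = evolve(demo_cities, pmap=pool.map)
    print(f"best tour length: {length:.3f}")
```

Because each breeding task carries its own seed, the serial run (`pmap=map`) and the pooled run produce the same result, which makes the parallel version easy to check. On a real cluster the same two map steps would be expressed with the framework's own primitives rather than a local pool.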

Parallel adaptive hybrid genetic optimization algorithm and its application

A Novel Parallel Multi-Objective Genetic Algorithm And Its Application In Process Scheduling

Distributed Parallel Genetic Algorithms Based on Multi-Agent Cooperation

A Parallel distributed genetic algorithm using Apache Spark for flexible scheduling of multitasks in a cloud manufacturing environment

Massively Parallel SPMD Algorithm for Cluster Computing — Combining Genetic Algorithm with Uphill

Highly Scalable Parallel Genetic Algorithm on Sunway Many-Core Processors

Parallel Local Search to Improve the Performance of Genetic Algorithms

A distributed genetic algorithm for reactive power optimization

Evaluation and Analysis of Distributed Graph-Parallel Processing Frameworks

PGO: A parallel computing platform for global optimization based on genetic algorithm

A Phoenix++ Based New Genetic Algorithm Involving Mechanism of Simulated Annealing

Energy-aware application scheduling based on genetic algorithm

Parallel Genetic Algorithms on Multiple FPGAs

The Parallel Genetic Algorithm Embedded with Downhill

Parallel Genetic Algorithm to Solve Traveling Salesman Problem on MapReduce Framework using Hadoop Cluster

Evaluating Large Graph Processing in MapReduce Based on Message Passing

Distributed genetic algorithm for application placement in the compute continuum leveraging infrastructure nodes for optimization

A framework for parallel genetic algorithms on PC cluster

Novel parallel hybrid genetic algorithms on the GPU for the generalized assignment problem

Towards parallel genetic algorithms on PC cluster