Straggler Mitigation for Distributed Behavioral Simulation

Eman Bin Khunayn, Shanika Karunasekera, Hairuo Xie, Kotagiri Ramamohanarao
DOI: https://doi.org/10.1109/icdcs.2017.165
2017-01-01
Abstract: Running large-scale behavioral simulations requires high computational power, which can be obtained by distributing the computational workload across multiple computing nodes (i.e., workers) that run in parallel. Implementations of such systems commonly follow the Bulk Synchronous Parallel (BSP) model. However, BSP implementations usually suffer from the straggler problem, where a delay in any one worker slows down the entire simulation. The problem usually arises from communication delays or an imbalanced workload among workers. To mitigate the straggler problem, we propose a novel parallel computational model, called the Priority Synchronous Parallel (PSP) model. PSP exploits the data dependencies of parallel processes to identify high-priority data, which is computed and synchronized first, while the remaining data is still being computed. PSP is implemented and evaluated using traffic simulations of three large cities. The proposed technique shows significant performance improvements over the BSP model.
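The straggler effect described in the abstract can be illustrated with a toy timing model. The sketch below is not the authors' PSP implementation; it only assumes that, under BSP, a superstep lasts as long as its slowest worker, and that computing high-priority data first lets its synchronization overlap with the remaining computation. The function names and all timings are hypothetical.

```python
def bsp_step_time(compute, comm):
    """One BSP superstep: all workers wait at the barrier for the slowest
    worker, so the step lasts max(compute + communication) over workers."""
    return max(c + m for c, m in zip(compute, comm))


def overlapped_step_time(priority, rest, comm):
    """Toy overlap model (an assumption, not the paper's algorithm):
    each worker computes its high-priority data first, then sends it
    while computing the remaining data, so the two costs overlap."""
    return max(p + max(r, m) for p, r, m in zip(priority, rest, comm))


# Hypothetical per-worker timings in seconds; the third worker has a slow link.
compute = [1.0, 1.0, 1.0]   # total compute time per worker
comm = [0.2, 0.2, 0.9]      # synchronization time per worker
priority = [0.3, 0.3, 0.3]  # share of compute spent on high-priority data
rest = [0.7, 0.7, 0.7]      # share of compute spent on the remaining data

bsp = bsp_step_time(compute, comm)
overlapped = overlapped_step_time(priority, rest, comm)
print(f"BSP step: {bsp:.2f} s, overlapped step: {overlapped:.2f} s")
```

With these made-up numbers, the single slow link stretches the BSP step to about 1.9 s, while the overlapped step takes about 1.2 s because most of the slow transfer is hidden behind the remaining computation.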