Abstract: A network of workstations (NOW), also called a cluster of workstations (COW), is attracting increasing attention as a viable platform for high-performance parallel computation. It offers a higher performance-to-price ratio, greater flexibility, and better scalability. However, a NOW has two major characteristics, being nondedicated and heterogeneous, that distinguish it from conventional multiprocessor and other multicomputer systems and make other parallel computational models unsuitable and inaccurate for it. This paper therefore presents a realistic parallel computational model for NOW and MPP, called the Nondedicated Heterogeneous Barrier LogGP (NHBL) model. The NHBL model is based on the LogGP model and extends it to fit the special characteristics of NOW. It aims to reflect how differences in computing power among workstations, and the computing capacity occupied by other users' applications, influence the design and analysis of parallel algorithms on NOW. The model also provides accurate computation and communication cost models. We first describe the NHBL model and its computation and communication cost models in detail, and illustrate the programming style and method of the NHBL model using the PSRS algorithm in an MPI environment. We then analyze the computation and communication costs of the PSRS algorithm with the NHBL model. Finally, we implement the PSRS algorithm on the NHPCC-NOW and the Dawning-1000 MPP, both located at the National High Performance Computing Center at Hefei, and validate the analytical results against the experimental results. The experiments show that the NHBL model captures the most important features of NOW and that it is practical and correct for both NOW and MPP. Furthermore, the NHBL model is realistic because it can work with only the subset of parameters that is sufficient for designing and analyzing algorithms on a given platform. Gathering more experimental data on more platforms and developing simpler, more accurate cost models are left as future work.
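
For readers unfamiliar with PSRS (Parallel Sorting by Regular Sampling), the benchmark algorithm named in the abstract, the following is a minimal single-process sketch of its phases. It is not the paper's MPI implementation and does not include any NHBL cost terms. The function name `psrs_sort`, the use of `p` virtual processors inside one Python process, and the particular pivot positions are assumptions made for illustration only.

```python
import bisect
import heapq


def psrs_sort(data, p):
    """Simulate the phases of PSRS on p virtual processors (illustrative only)."""
    n = len(data)
    if p <= 1 or n < p * p:
        # Too little data for regular sampling; fall back to a plain sort.
        return sorted(data)

    # Phase 1: split the input into p contiguous blocks and sort each locally.
    size = -(-n // p)  # ceiling division
    blocks = [sorted(data[i * size:(i + 1) * size]) for i in range(p)]

    # Phase 2: every block contributes p regularly spaced samples.
    samples = sorted(b[k * len(b) // p] for b in blocks for k in range(p))

    # Phase 3: choose p - 1 pivots from the p * p sorted samples.
    # (Assumed positions; any sorted pivots keep the result correct,
    # the choice only affects load balance.)
    pivots = [samples[i * p + p // 2 - 1] for i in range(1, p)]

    # Phase 4: each block is cut into p parts at the pivots.
    # In an MPI program, part j would be sent to processor j.
    parts = []
    for b in blocks:
        cuts = [0] + [bisect.bisect_right(b, pv) for pv in pivots] + [len(b)]
        parts.append([b[cuts[j]:cuts[j + 1]] for j in range(p)])

    # Phase 5: processor j merges the sorted parts it received.
    result = []
    for j in range(p):
        result.extend(heapq.merge(*(parts[i][j] for i in range(p))))
    return result
```

In a message-passing version, phases 1 and 5 would be local computation, while phases 2 to 4 would involve gathering samples, distributing pivots, and an all-to-all exchange of parts. Those are the steps whose computation and communication costs a model of this kind would need to account for.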

A Realistic Parallel Computational Model