Abstract: In cluster systems, the execution of a parallel program is nondeterministic, which makes debugging parallel programs very difficult. This nondeterminism is mainly caused by the many kinds of disturbance, or perturbation, present in the running environment. This paper studies the characteristics of the perturbation that a debugger imposes on a parallel program while the user debugs it in interactive mode. The problem is difficult, but understanding it is very helpful for the design and implementation of a practical debugger for cluster systems. First, several techniques used to reduce perturbation are briefly discussed. Then, we build our own message-passing model of parallel programs in cluster systems. Our model differs from others in that we introduce D_max and D_min, which represent the maximum and minimum latency of messages in cluster systems, respectively. To describe the execution behavior of parallel programs accurately, we define the terms state freezing and equivalent execution. Next, we analyze in detail the perturbation that a debugger imposes on a parallel program. Finally, we identify the conditions under which the debugger produces perturbation and formally prove these results. Based on our results, we design two algorithms that inform the user, in real time, of the perturbation that a debugger has imposed on the debugged program. We have developed a debugging tool for cluster systems, DENNET. These algorithms have been integrated into DENNET, and the corresponding debugging mode is named 'pure mode'. When debugging a parallel program, users can choose whether or not to use 'pure mode'. Acknowledgment time and latency are the two key parameters in our algorithms. At the end of the paper, test results for these two parameters are given.

Analyzing nondeterminacy of message passing programs

Petri net modelling of Occam programs for detecting indeterminacy, non-termination and deadlock anomalies

Trace-Based Temporal Verification for Message-Passing Programs

On-Line Debugging of Parallel Programs

Analysis for Intransitive Noninterference Security Properties in Probabilistic Systems

Verification of Nondeterministic Quantum Programs

Algorithmic Analysis of Termination Problems for Nondeterministic Quantum Programs

Verifying the safety properties of distributed systems via mergeable parallelism

A Noninterference Model for Nondeterministic Systems

A Survey of Graph Comparison Methods with Applications to Nondeterminism in High-Performance Computing

Research on Perturbation Imposed on Parallel Programs by Debuggers

An Overview Of Methods For Dependence Analysis Of Concurrent Programs

Unifying Probability with Nondeterminism

An approach to analyzing dependency of concurrent programs

Termination of Nondeterministic Quantum Programs

Approaches to Obtaining Shared Memory Dependences for Dynamic Analysis of Concurrent Programs: A Survey

Reachability and Termination Analysis of Concurrent Quantum Programs

Characterization for communication pattern of message-passing application

A Program Instrumentation for Prefix-Based Tracing in Message-Passing Concurrency

Detecting dead statements for concurrent programs

Rely-Guarantee Based Reasoning for Message-Passing Programs