Research on Perturbation Imposed on Parallel Programs by Debuggers

Liu Jian (刘建), Shen Meiming (沈美明), Zheng Weimin (郑纬民)
DOI: https://doi.org/10.3321/j.issn:0254-4164.2002.02.002
2002-01-01
Abstract: In cluster systems, the execution of a parallel program is nondeterministic, which makes debugging parallel programs very difficult. This nondeterminism is mainly caused by the various disturbances, or perturbations, present in the running environment. This paper studies the characteristics of the perturbation that a debugger imposes on a parallel program while the user debugs it in interactive mode. The problem is difficult to analyze, but its solution is very helpful for the design and implementation of a practical debugger for cluster systems. First, several techniques used to reduce perturbation are briefly discussed. We then build our own message-passing model of parallel programs in cluster systems. Our model differs from others in that we introduce D_max and D_min, which represent the maximum and minimum latency of messages in cluster systems, respectively. To describe the execution behavior of parallel programs accurately, we define the terms state freezing and equivalent execution. We then analyze in detail how a debugger perturbs a parallel program, derive the conditions under which the debugger produces perturbation, and prove these results formally. Based on these results, we design two algorithms that inform the user in real time of the perturbation that the debugger has imposed on the debugged program. We have developed DENNET, a debugging tool for cluster systems. The two algorithms have been integrated into DENNET, and the corresponding debugging mode is named 'pure mode'. When debugging a parallel program, users can choose whether or not to use 'pure mode'. Acknowledgment time and latency are the two key parameters in our algorithms. At the end of the paper, test results for these two parameters are given.