Progress in development of generally applicable method for monitoring parallel software

Zhang Yanyuan, Liu Min, Ye Jun, Jiang Liyuan
1995-01-01
Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University
Abstract: Ref. does not offer any information on monitoring 3L Parallel FORTRAN. To meet this need, the authors have developed a method that is generally applicable to monitoring various parallel programs. To the authors' best knowledge, because of the great difficulties involved, only special-purpose systems exist for monitoring specific parallel software, such as GMAT of Cray and INSTANT of BBN. The authors' strategy is essentially as follows. A large number of probes are placed in the user's parallel programs. While a program is running, these probes can detect trouble occurring in any part of it and report through a special communication network to a special management system. Users can thus discover any trouble that occurs while their parallel programs run. As a step towards solving the problems associated with dynamic tracing of parallel programs, the authors developed an appropriate state transition mechanism for tasks. The method was recently examined by a group of Chinese experts organized by the funding agency and was deemed novel, dependable, and useful.
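The probe-and-report strategy in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which targets 3L Parallel FORTRAN and is not detailed here. Everything in it is an assumption made for illustration: Python threads stand in for parallel tasks, a queue stands in for the communication network, and the state names, the `TRANSITIONS` table, and the `Monitor`, `probe`, `drain`, and `worker` names are all hypothetical.

```python
import queue
import threading

# Hypothetical task states and allowed transitions (not taken from the paper).
TRANSITIONS = {
    "created": {"running"},
    "running": {"blocked", "finished", "failed"},
    "blocked": {"running"},
}


class Monitor:
    """Stands in for the management system: collects probe reports."""

    def __init__(self):
        self.channel = queue.Queue()  # stands in for the communication network
        self.state = {}               # task id -> last reported state
        self.troubles = []            # failures and illegal transitions

    def probe(self, task_id, new_state, detail=""):
        """Probe call placed in user code; it only sends a report."""
        self.channel.put((task_id, new_state, detail))

    def drain(self):
        """Process all pending reports and update the task state table."""
        while True:
            try:
                task_id, new_state, detail = self.channel.get_nowait()
            except queue.Empty:
                return
            old = self.state.get(task_id, "created")
            if new_state not in TRANSITIONS.get(old, set()):
                self.troubles.append(
                    (task_id, f"illegal transition {old} -> {new_state}"))
            elif new_state == "failed":
                self.troubles.append((task_id, detail))
            self.state[task_id] = new_state


def worker(monitor, task_id, divisor):
    """A user task instrumented with probes at entry, exit, and on error."""
    monitor.probe(task_id, "running")
    try:
        _ = 10 / divisor
        monitor.probe(task_id, "finished")
    except ZeroDivisionError as exc:
        monitor.probe(task_id, "failed", str(exc))


def run_demo():
    """Run three instrumented tasks; the second one fails on purpose."""
    monitor = Monitor()
    threads = [threading.Thread(target=worker, args=(monitor, i, d))
               for i, d in enumerate([1, 0, 2])]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    monitor.drain()
    return monitor
```

Calling `run_demo()` returns a monitor whose `state` table records the last reported state of each task and whose `troubles` list holds the failure reported by the second task. Keeping the probes limited to sending reports, with all bookkeeping done on the monitor side, is one plausible way to keep the instrumentation overhead inside user code small.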