A Task Migration Mechanism for MPI Applications

Youhui Zhang, Dan Pei, Dongsheng Wang, Weimin Zheng
1999-01-01
Abstract: Clusters of computers (COCs) are increasingly used to run large parallel programs. Task migration is a useful facility for implementing load balancing and high availability in COCs. This paper presents a fast migration protocol for MPI tasks that allows non-migrating tasks to continue executing during most of the migration. Process table updating and synchronization are the key mechanisms of this protocol. Because MPI makes no provision for task migration, the paper also describes the modifications to an MPI implementation required to support it. Finally, we introduce our task migration system, which is built entirely on this protocol.