Scalable Resource Management System for High Productive Computing

Yutong Lu, Nong Xiao, Xuejun Yang
2008-01-01
Abstract: High performance computing is shifting its focus toward providing high productivity computing systems (HPCS), rather than pursuing high performance alone. HPCS need a more scalable and powerful resource management system. This paper proposes a scalable hierarchical resource management architecture with cascaded services to support the scalability of HPCS. We design a method for dynamic, self-organizing service configuration, optimize the communication protocol for system management, and construct a virtual topology tree to reduce the overhead of the resource management system and speed up the loading of large-scale parallel jobs. A scalable resource management system (SRMS) has been implemented, and experiments have been conducted to evaluate its scalability.
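The abstract does not say how the virtual topology tree is built. As a rough illustration of why a tree overlay can shorten large-scale job loading, the sketch below assumes a plain k-ary tree laid over node ranks 0..n-1. The function names and the fanout parameter are hypothetical and are not taken from SRMS.

```python
def children(rank, fanout, n):
    """Ranks that `rank` forwards a launch request to (assumed k-ary layout)."""
    first = rank * fanout + 1
    return list(range(first, min(first + fanout, n)))


def parent(rank, fanout):
    """Rank that forwards to `rank`; the root (rank 0) has no parent."""
    return None if rank == 0 else (rank - 1) // fanout


def launch_rounds(n, fanout):
    """Forwarding levels needed until all n ranks have received the request."""
    rounds, reached, width = 0, 1, 1  # the root already holds the request
    while reached < n:
        width *= fanout      # number of ranks on the next tree level
        reached += width
        rounds += 1
    return rounds


if __name__ == "__main__":
    # Compare tree depth for a few fanouts on a hypothetical 1024-node job.
    for k in (2, 8, 32):
        print(k, launch_rounds(1024, k))
```

Under this assumption the number of forwarding levels grows logarithmically with the node count, whereas a single service contacting every node directly would issue one message per node.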