The Design and Architecture of the Microsoft Cluster Service -- A Practical Approach to High-Availability and Scalability

Werner Vogels, Dan Dumitriu, Ken Birman, Rod Gamache, Mike Massa, Rob Short, John Vert, Joe Barrera
DOI: https://doi.org/10.48550/arXiv.cs/9809006
1998-09-03
Abstract: Microsoft Cluster Service (MSCS) extends the Windows NT operating system to support high-availability services. The goal is to offer an execution environment where off-the-shelf server applications can continue to operate, even in the presence of node failures. Later versions of MSCS will provide scalability via a node and application management system that allows applications to scale to hundreds of nodes. This paper provides a detailed description of the MSCS architecture and the design decisions that have driven the implementation of the service. The paper also describes how some major applications use the MSCS features, and describes features added to make it easier to implement and manage fault-tolerant applications on MSCS.
Operating Systems; Distributed, Parallel, and Cluster Computing