TerraServer SAN-Cluster Architecture and Operations Experience

Tom Barclay, Jim Gray
DOI: https://doi.org/10.48550/arXiv.cs/0502010
2005-02-02
Distributed, Parallel, and Cluster Computing
Abstract: Microsoft TerraServer displays aerial, satellite, and topographic images of the earth in a SQL database available via the Internet. It is one of the most popular online atlases, presenting seventeen terabytes of image data from the United States Geological Survey (USGS). Initially deployed in 1998, the system demonstrated the scalability of PC hardware and software - Windows and SQL Server - on a single, mainframe-class processor. In September 2000, the back-end database application was migrated to a 4-node active/passive cluster connected to an 18-terabyte Storage Area Network (SAN). The new configuration was designed to achieve 99.99% availability for the back-end application. This paper describes the hardware and software components of the TerraServer Cluster and SAN, and describes our experience in configuring and operating this system for three years. Not surprisingly, the hardware and architecture delivered better than four-9's of availability, but operations mistakes delivered three-9's.
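To make the availability figures concrete, the arithmetic behind "four-9's" and "three-9's" can be sketched as below. This is an illustrative calculation and is not taken from the paper: the function name `downtime_minutes_per_year` is hypothetical, and a 365-day year is assumed.

```python
# Illustrative sketch: convert an availability fraction into the downtime
# it permits per year. Assumes a 365-day (non-leap) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability: float) -> float:
    """Return the minutes of downtime per year allowed by `availability`.

    `availability` is a fraction between 0 and 1, e.g. 0.9999 for 99.99%.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for label, availability in [("three 9's", 0.999), ("four 9's", 0.9999)]:
        minutes = downtime_minutes_per_year(availability)
        print(f"{label} ({availability:.2%}): {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, 99.99% availability allows roughly 53 minutes of downtime per year, while 99.9% allows about 8.8 hours, a tenfold difference between the design target and what the abstract attributes to operations mistakes.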