2002 Performance / Price Sort and PennySort

Peng Liu, Yao Shi, Xiaoge Wang
2002-01-01
Abstract: The Sort Benchmark is a set of world records that track the progress of computer technology in transaction processing. Among the various sort benchmarks, PennySort and Performance / Price Sort were defined to test the maximum cost efficiency of sort machines. Our program, THSort, is a two-pass external sort program designed to exploit the potential of inexpensive general-purpose machines. Running on our customized computer, THSort is able to sort 9.8 GB of data (105,000,000 records of 100 bytes each) for a penny, more than double last year's record (4.19 GB). The paper presents the considerations behind our custom system configuration and reports its PennySort and Performance / Price Sort results, as well as its Datamation Sort and Minute Sort results. The paper also argues that PennySort needs to be revised into Performance / Price Sort, and provides a simpler method for calculating the Performance / Price Sort result.

About the team: The High Performance Institute at Tsinghua University was the first group in China to start research on clusters (since 1995) and grid computing (since 1998). We focus on parallel and distributed high performance computing and on high performance servers. This is why the sort benchmark fascinates us: we are curious about how to release the potential I/O capability of a computer. Moreover, because sorting is one of the most common and time-consuming tasks in transaction processing, the sort benchmark can serve as an important performance criterion for servers, especially database servers. Our sort benchmark team also includes Kuo Zhang, Tian Wang, ZunChong Tian, and Hao Wang. In the past few weeks, we have divided our team into two groups, working in both collaboration and competition. This report also includes their contributions.