Performance Study on an In-Memory OLTP Database
Huanchen Zhang, Zhuo Chen
2014-01-01
Abstract: OLTP (online transaction processing) database systems are essential building blocks for many popular web services (e.g. Amazon) [13]. Traditionally, OLTP systems were no different from other database management systems (DBMSs): they all ran on popular general-purpose relational DBMSs such as DB2 [9] and Shore [1]. In the past few years, however, a number of multicore main-memory OLTP databases, including Hekaton [6], VoltDB [4], H-Store [10] and Silo [15], have emerged and gained increasing attention because of their performance. For example, Silo can achieve 700,000 transactions per second running the TPC-C benchmark on a 32-core machine [15], a two-orders-of-magnitude improvement over disk-based relational DBMSs. Silo-R, a recent extension of Silo, adds durability to the database and can still achieve 550,000 transactions per second under the same settings [16].
Two factors drive this technical transition. First, the characteristics of the OLTP market make main memory a desirable choice. OLTP workloads usually involve business data processing [7], which requires high peak throughput. For example, China's e-commerce giant Alibaba received 278 million orders ($9.3 billion worth) during a 24-hour online shopping festival on November 11th, 2014 [14], with the first billion worth of orders placed in less than 20 minutes [7]. Such high throughput may be difficult for current disk-based databases to match. Second, main memory has become sufficiently cheap (and fast) to host OLTP datasets. A modern commercial server typically has several terabytes of RAM [15], which is more than enough for most OLTP workloads. Moreover, the size of OLTP systems usually does not grow exponentially as RAM capacity does, because customers and real-world entities do not obey Moore's law [8].
Seven years ago, Harizopoulos et al. presented a performance breakdown of Shore running in memory [8].
In this work, we conduct a detailed performance study of Silo [15], a recent state-of-the-art main-memory database. Compared to Shore, eliminating the buffer manager (centralized page manager) significantly improves the performance of table operations (record accesses), and we found that index operations are now the new bottleneck in Silo. In terms of cache performance, misses for table operations mostly happen at the last-level cache, indicating poor locality among table records. The RCU (read-copy-update) regions of the underlying B-tree indices, which support concurrent access, also cause a significant number of last-level write misses. In addition, we examined the overhead of Silo's OCC (optimistic concurrency control) protocol. We found that under normal workloads, OCC still imposes a significant overhead at commit time, especially when transactions are short and write-intensive. As the workload becomes more skewed or the number of threads grows, OCC overhead starts to dominate, mainly because the data read during transaction execution is more likely to be modified by other threads, causing the transaction to abort. We also report our findings as we vary the mix of transaction types. For example, we found that the abort rate peaks where there are 50% read transactions and 50% write transactions.
The remainder of this report is organized as follows. Section 2 gives an overview of our target system, Silo. Section 3 introduces the measurement methodology and the benchmark we use. We report our measurement results in Section 4. Finally, we conclude our report in Section 5.
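To make the abort behavior described above concrete, the following is a minimal, single-threaded sketch of generic OCC read-set validation. It is not Silo's actual commit protocol (which involves transaction IDs, write-set locking and epochs); the names `Record`, `Transaction`, `read_set` and `write_set` are hypothetical and chosen only for illustration. The sketch shows why a transaction aborts when a record it read is modified by another transaction before it commits.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    value: int = 0
    version: int = 0  # bumped on every committed write (illustrative stand-in for a TID)


@dataclass
class Transaction:
    db: dict  # shared store: key -> Record
    read_set: dict = field(default_factory=dict)   # key -> version observed at first read
    write_set: dict = field(default_factory=dict)  # key -> value buffered until commit

    def read(self, key):
        # Reads see the transaction's own buffered writes first.
        if key in self.write_set:
            return self.write_set[key]
        rec = self.db[key]
        # Remember the version seen the first time this key is read.
        self.read_set.setdefault(key, rec.version)
        return rec.value

    def write(self, key, value):
        # Writes are buffered locally and only applied at commit time.
        self.write_set[key] = value

    def commit(self):
        # Validation phase: abort if any record we read has changed since.
        for key, seen_version in self.read_set.items():
            if self.db[key].version != seen_version:
                return False  # abort; buffered writes are discarded
        # Write phase: install buffered writes and bump versions.
        for key, value in self.write_set.items():
            rec = self.db[key]
            rec.value = value
            rec.version += 1
        return True


# Two transactions touch the same record "a".
db = {"a": Record(10), "b": Record(20)}
t1, t2 = Transaction(db), Transaction(db)
t1.read("a")            # t1 observes version 0 of "a"
t2.read("a")
t2.write("a", 11)
t2_committed = t2.commit()   # succeeds and bumps "a" to version 1
t1.write("b", 99)
t1_committed = t1.commit()   # fails validation: "a" changed after t1 read it
```

Under this simplified model, the more often concurrent transactions touch the same records (as with skewed workloads or more threads), the more often validation fails, which is consistent with the abort trend reported above.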