Optimizing Inter-data-center Large-Scale Database Parallel Replication with Workload-Driven Partitioning

Hong Min, Zhen Gao, Xiao Li, Jie Huang, Yi Jin, Serge Bourbonnais, Miao Zheng, Gene Fuh
DOI: https://doi.org/10.1007/978-3-319-10085-2_38
2014-01-01
Abstract: Inter-data-center asynchronous middleware replication between active-active databases has become essential for achieving continuous business availability. Near-real-time replication latency is expected despite intermittent peaks in transaction volume. Database tables are divided for replication across multiple parallel replication consistency groups, each with a maximum throughput capacity, but this division can break transaction integrity. It is often not known which tables can be updated by a common transaction. Independent replication also requires balancing resource utilization against latency objectives. Our work provides a method to optimize replication latencies while minimizing transaction splits and using a minimum number of parallel replication consistency groups. We present a two-stage approach: log-based workload discovery and analysis, followed by history-based database partitioning. Experimental results from a real banking batch workload and a benchmark OLTP workload demonstrate the effectiveness of our solution, even when partitioning thousands of database tables for very large workloads.