SLSM: An Efficient Strategy for Lazy Schema Migration on Shared-Nothing Databases

Zhilin Zeng, Hui Li, Xiyue Gao, Hui Zhang, Huiquan Zhang, Jiangtao Cui
2024-04-05
Abstract: By introducing intermediate states for metadata changes and ensuring that at most two versions of metadata exist in the cluster at the same time, shared-nothing databases are capable of making online, asynchronous schema changes. However, this method leads to delays in the deployment of new schemas since it requires waiting for massive data backfill. To shorten the service vacuum period before the new schema is available, this paper proposes a strategy named SLSM for zero-downtime schema migration on shared-nothing databases. Based on the lazy migration of stand-alone databases, SLSM keeps the old and new schemas with the same data distribution, reducing the node communication overhead of executing migration transactions for shared-nothing databases. Further, SLSM combines migration transactions with user transactions by extending the distributed execution plan to allow the data involved in migration transactions to directly serve user transactions, greatly reducing the waiting time of user transactions. Experiments demonstrate that our strategy can greatly reduce the latency of user transactions and improve the efficiency of data migration compared to existing schemes.
Databases
What problem does this paper attempt to address?
### What problems does this paper attempt to solve?

This paper addresses the latency and unavailability that arise during schema migration in shared-nothing databases. Traditional online schema change techniques must wait for a large data backfill to finish before the new schema can be used. This creates a long window during which the new schema is unavailable (the "service vacuum period" in the abstract), and the time cost of a schema change grows with the size of the data.

#### Background problems

1. **Service vacuum period**: Traditional methods must wait for a large data backfill before the new schema becomes available, which delays its deployment.
2. **High latency**: User transactions must wait for migration transactions to complete, which increases response time.
3. **High resource consumption**: Migration involves a large amount of communication between nodes, which increases the system's resource consumption.

#### Solutions proposed in the paper

To solve these problems, the paper proposes SLSM, a lazy schema migration strategy for shared-nothing databases that aims for zero-downtime schema migration. Its main features are:

1. **Lazy migration**: Building on the lazy migration used in stand-alone databases, SLSM keeps the data distribution of the old and new schemas consistent, which reduces the inter-node communication needed to execute migration transactions.
2. **Combining user transactions with migration transactions**: SLSM extends the distributed execution plan so that the data handled by a migration transaction can directly serve the user transaction, greatly reducing the waiting time of user transactions.
3. **Background migration**: SLSM also starts a background migration process that gradually covers the entire old table until the migration is complete.
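
The lazy-migration idea listed above can be illustrated with a toy, single-process model. This is only a sketch under stated assumptions, not the paper's implementation: the class `LazyMigrator`, its methods, and the dictionary-based "tables" are hypothetical names introduced for illustration, and distribution, concurrency control, and the distributed execution plan are all left out. It shows a row being migrated the first time it is accessed, with the migrated row returned directly to the caller, while a background step sweeps the remaining rows.

```python
from typing import Callable, Dict, Optional

Row = Dict[str, object]


class LazyMigrator:
    """Toy model of lazy (on-access) schema migration; all names are hypothetical."""

    def __init__(self, old_table: Dict[int, Row], transform: Callable[[Row], Row]):
        self.old_table = old_table            # rows stored under the old schema
        self.new_table: Dict[int, Row] = {}   # rows already migrated to the new schema
        self.transform = transform            # maps an old-schema row to a new-schema row

    def _migrate(self, key: int) -> Optional[Row]:
        # Migrate one row if it exists and has not been migrated yet,
        # then return the new-schema row (or None if the key is unknown).
        if key not in self.new_table and key in self.old_table:
            self.new_table[key] = self.transform(self.old_table[key])
        return self.new_table.get(key)

    def read(self, key: int) -> Optional[Row]:
        # "Merged" access: the row produced by the migration step is handed
        # straight to the caller instead of being re-read afterwards.
        return self._migrate(key)

    def background_step(self, batch_size: int) -> int:
        # Migrate up to batch_size rows that are still pending; return the count.
        pending = [k for k in self.old_table if k not in self.new_table][:batch_size]
        for key in pending:
            self._migrate(key)
        return len(pending)

    def done(self) -> bool:
        # Migration is complete once every old row has a new-schema counterpart.
        return all(k in self.new_table for k in self.old_table)
```

In this sketch a caller would construct `LazyMigrator(old_rows, transform)`, serve reads through `read`, and call `background_step` repeatedly until `done()` returns true. Rows touched by user reads are never migrated twice, because `_migrate` skips keys that are already in the new table.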
#### Specific improvement measures

- **Migration transaction optimization**: SLSM adjusts the metadata so that the lease holder of each source data slice is also the lease holder of the corresponding target slice, which reduces network round trips.
- **User transaction optimization**: SLSM merges the migration transaction and the user transaction into a single "merged transaction", so that the data in the migration transaction can directly serve the user transaction and the waiting time is reduced.

With these improvements, SLSM can significantly improve the efficiency of schema migration, shorten the period before the new schema is available, and reduce the response time of user transactions without disrupting normal operation of the system.

### Summary

The paper targets two problems that arise during schema migration in distributed shared-nothing databases: the delay before the new schema becomes available and the high latency of user transactions. By adopting lazy migration and optimizing how migration and user transactions are processed, SLSM achieves zero-downtime schema migration and improves both user-transaction latency and migration efficiency.
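
As a rough illustration of the lease-holder alignment described under the specific improvement measures, the sketch below compares a placement in which target slices are assigned to nodes independently with one in which each target slice reuses the lease holder of its source slice. Everything here is an assumption made for illustration: the functions `round_robin_placement`, `colocate_leaseholders`, and `remote_migrations`, the node names, and the idea of counting "remote" slices as a proxy for network round trips do not come from the paper.

```python
from typing import Dict, List, Tuple

KeyRange = Tuple[int, int]  # half-open key interval [start, end)


def round_robin_placement(
    ranges: List[KeyRange], nodes: List[str], offset: int = 0
) -> Dict[KeyRange, str]:
    # Hypothetical default placement: spread slices over nodes in order.
    return {r: nodes[(i + offset) % len(nodes)] for i, r in enumerate(ranges)}


def colocate_leaseholders(source: Dict[KeyRange, str]) -> Dict[KeyRange, str]:
    # Give every target slice the same lease holder as the source slice
    # that covers the same key interval.
    return dict(source)


def remote_migrations(
    source: Dict[KeyRange, str], target: Dict[KeyRange, str]
) -> int:
    # Count slices whose source and target lease holders differ, i.e. slices
    # whose migration would have to cross nodes in this toy model.
    return sum(1 for r, node in source.items() if target.get(r) != node)


ranges = [(0, 100), (100, 200), (200, 300)]
nodes = ["n1", "n2", "n3"]
source = round_robin_placement(ranges, nodes)
independent_target = round_robin_placement(ranges, nodes, offset=1)
colocated_target = colocate_leaseholders(source)
```

With these inputs, every slice in `independent_target` sits on a different node from its source, whereas `colocated_target` keeps each source and target slice on the same node, so migrating a slice needs no cross-node hop in this model.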