Abstract: Many improvements to programming have come from shortening feedback loops, for example with Integrated Development Environments, Unit Testing, Live Programming, and Distributed Version Control. A barrier to feedback that deserves greater attention is Schema Evolution. When requirements on the shape of data change then existing data must be migrated into the new shape, and existing code must be modified to suit. Currently these adaptations are often performed manually, or with ad hoc scripts. Manual schema evolution not only delays feedback but since it occurs outside the purview of version control tools it also interrupts collaboration.
Schema evolution has long been studied in databases. We observe that the problem also occurs in non-database contexts that have been less studied. We present a suite of challenge problems exemplifying this range of contexts, including traditional database programming as well as live front-end programming, model-driven development, and collaboration in computational documents. We systematize these various contexts by defining a set of layers and dimensions of schema evolution.
We offer these challenge problems to ground future research on the general problem of schema evolution in interactive programming systems and to serve as a basis for evaluating the results of that research. We hope that better support for schema evolution will make programming more live and collaboration more fluid.
What problem does this paper attempt to address?
This paper addresses **how schema evolution disrupts the feedback loop**, especially in interactive programming systems. Specifically:
1. **Background problems**:
- Many improvements in programming have come from shortening the feedback loop, for example through integrated development environments (IDEs), unit testing, live programming, and distributed version control.
- However, when requirements on the shape of data change, existing data must be migrated into the new shape and existing code must be modified to match. These changes are usually made manually or with ad hoc scripts, which delays feedback and, because the work happens outside version control tools, also disrupts collaboration.
2. **The universality of Schema Evolution**:
- Schema evolution has long been studied in the database field, but the same problem also arises in non-database contexts (such as live front-end programming, model-driven development, and collaboration in computational documents), where it has received less attention.
- For example, in live programming, state is usually transient and is re-created after each edit. If the shape of longer-lived state changes, however, the illusion of liveness breaks (for example, hot reloading may fail). Similarly, in collaborative programming, when a code change affects the expected shape of external data, coordination must happen outside the version control system.
3. **Objectives**:
- The paper presents a suite of challenge problems covering schema evolution in different contexts, including traditional database programming, live front-end programming, model-driven development, and collaboration in computational documents.
- The authors hope that these problems can ground future research and serve as a basis for evaluating its results, ultimately making programming more live and collaboration more fluid.
### Specific problem examples
#### Challenge problem #1: Live State Type Evolution
- **Scenario**: In a live programming system based on the Elm architecture, the state and event types of the user interface need to be modified without restarting the application.
- **Question**: When the `State` type changes, how can the existing data be migrated automatically and the relevant code updated so that the application stays live and consistent?
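
To make the question concrete, here is a minimal sketch of a hand-written state migration, assuming a hypothetical counter application whose state gains a `step` field. The names `StateV1`, `StateV2`, `migrate_v1_to_v2`, and `update_v2` are illustrative assumptions and do not come from the paper, and Python stands in for an Elm-style language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateV1:
    """Old shape of the (hypothetical) application state."""
    count: int


@dataclass(frozen=True)
class StateV2:
    """New shape: the programmer added a configurable step size."""
    count: int
    step: int


def migrate_v1_to_v2(old: StateV1) -> StateV2:
    """Carry existing data forward and pick a default for the new field."""
    return StateV2(count=old.count, step=1)


def update_v2(state: StateV2, event: str) -> StateV2:
    """Elm-style update function, rewritten for the new state shape."""
    if event == "increment":
        return StateV2(count=state.count + state.step, step=state.step)
    if event == "double_step":
        return StateV2(count=state.count, step=state.step * 2)
    return state  # unknown events leave the state unchanged


# The running application holds a V1 value when the code is edited.
live_state = StateV1(count=3)

# Instead of restarting (and losing count=3), migrate the live value.
live_state = migrate_v1_to_v2(live_state)
live_state = update_v2(live_state, "increment")
```

In this sketch the migration and the updated `update_v2` are both written by hand. The challenge asks how a programming system could derive or assist with both steps.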
#### Challenge problem #2: Extract Entity
- **Scenario**: Acme Company needs to refactor order records from a flat table into two tables, one for orders and the other for customer information.
- **Question**: How can duplicate customer entries be merged automatically, unique identifiers assigned, and the orders table made to reference those identifiers, while handling potential data errors (such as misspellings)?
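
The refactoring can be sketched as follows, assuming a hypothetical flat table with the columns `order_id`, `customer_name`, `customer_address`, and `item` (none of these names come from the paper). The sketch deduplicates by exact match only, so it does not address the misspelling problem that the challenge highlights.

```python
# Hypothetical flat order records; column names are assumptions.
flat_orders = [
    {"order_id": 1, "customer_name": "Alice Smith",
     "customer_address": "1 Main St", "item": "Anvil"},
    {"order_id": 2, "customer_name": "Bob Jones",
     "customer_address": "2 Oak Ave", "item": "Rocket"},
    {"order_id": 3, "customer_name": "Alice Smith",
     "customer_address": "1 Main St", "item": "Magnet"},
]


def extract_customers(rows):
    """Split flat rows into a customers table and an orders table."""
    ids_by_key = {}  # (name, address) -> newly assigned customer_id
    orders = []
    for row in rows:
        key = (row["customer_name"], row["customer_address"])
        if key not in ids_by_key:
            ids_by_key[key] = len(ids_by_key) + 1  # assign the next id
        orders.append({
            "order_id": row["order_id"],
            "customer_id": ids_by_key[key],  # reference replaces the copy
            "item": row["item"],
        })
    customers = [
        {"customer_id": cid, "name": name, "address": address}
        for (name, address), cid in ids_by_key.items()
    ]
    return customers, orders


customers, orders = extract_customers(flat_orders)
```

With exact matching, a row spelled "Alice Smyth" would silently become a third customer. Deciding when near-duplicates should be merged, and letting a user correct those decisions, is the part that makes this a challenge problem.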
#### Challenge problem #3: Code Co-evolution for Extract Entity
- **Scenario**: In a programming system, when the data structures change, the related code (such as SQL queries) must be changed in step with them.
- **Question**: How can code and data structures be kept evolving together while minimizing manual intervention by the user and avoiding the introduction of new errors?
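
As an illustration of the kind of rewrite involved, the sketch below uses Python's built-in `sqlite3` module to run a query against a flat table and its hand-adapted equivalent against the refactored tables. The table names, column names, and sample rows are assumptions made for this example, not taken from the paper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Old schema: one flat table (hypothetical names).
CREATE TABLE orders_flat (
    order_id INTEGER PRIMARY KEY,
    customer_name TEXT,
    item TEXT
);
INSERT INTO orders_flat VALUES
    (1, 'Alice', 'Anvil'), (2, 'Bob', 'Rocket'), (3, 'Alice', 'Magnet');

-- New schema: customers extracted into their own table.
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name TEXT UNIQUE
);
INSERT INTO customers (name)
    SELECT DISTINCT customer_name FROM orders_flat ORDER BY customer_name;

CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    item TEXT
);
INSERT INTO orders
    SELECT f.order_id, c.customer_id, f.item
    FROM orders_flat AS f
    JOIN customers AS c ON c.name = f.customer_name;
""")

# Query as written against the old schema.
OLD_QUERY = """
SELECT order_id, item FROM orders_flat
WHERE customer_name = ? ORDER BY order_id
"""

# The same query, adapted by hand to the new schema with a join.
NEW_QUERY = """
SELECT o.order_id, o.item
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
WHERE c.name = ? ORDER BY o.order_id
"""

old_rows = conn.execute(OLD_QUERY, ("Alice",)).fetchall()
new_rows = conn.execute(NEW_QUERY, ("Alice",)).fetchall()
```

Here the two queries return the same rows, which is the property a co-evolution tool would need to preserve. The sketch only checks that property for one query and one input; it does not derive `NEW_QUERY` from `OLD_QUERY` automatically.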
#### Challenge problem #4: Structured Document Edits
- **Scenario**: Multiple collaborators jointly edit a meeting planning document, involving speaker lists, budget calculations, etc.
- **Question**: How can different collaborators' changes to the document's structure and content be merged automatically so that the final result contains all the newly added data and conforms to the new structure?
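
One simple way to picture such a merge, assuming a hypothetical speaker list in which one collaborator adds a row while another splits the `name` column into `first` and `last`, is to push the newly added rows through the structural change. The document shape and all function names below are assumptions for illustration, not the paper's design.

```python
# Shared starting point: a speaker list inside the planning document.
base = [{"name": "Ada Lovelace", "talk": "Notes"}]


def add_speaker(doc, name, talk):
    """Content edit: append a speaker in the current document shape."""
    return doc + [{"name": name, "talk": talk}]


def split_name(doc):
    """Structural edit: replace `name` with `first` and `last` columns."""
    result = []
    for row in doc:
        first, _, last = row["name"].partition(" ")
        result.append({"first": first, "last": last, "talk": row["talk"]})
    return result


def added_rows(original, edited):
    """Rows present in the edited copy but not in the original."""
    return [row for row in edited if row not in original]


# Two collaborators edit concurrently, both starting from `base`.
doc_a = add_speaker(base, "Alan Turing", "Machines")  # adds data
doc_b = split_name(base)                              # changes structure

# Naive merge: mixes old-shape and new-shape rows in one table.
naive = doc_b + added_rows(base, doc_a)

# Schema-aware merge: migrate A's additions into B's new structure.
merged = doc_b + split_name(added_rows(base, doc_a))
```

The schema-aware merge works here only because the structural edit is available as a reusable function. A conventional text-based merge sees just the before and after documents, which is why this scenario is posed as a challenge.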
By posing these problems, the authors hope to spur work that tightens the feedback loop of programming systems and makes programming more efficient and collaboration-friendly.