Abstract: There is significant interest in examining large datasets using complex domain-specific queries. In many cases, these queries can be accelerated using specialized indexes. Unfortunately, developing a practical index is difficult, because databases generally require additional features such as updates, concurrency support, and crash recovery. Three major lines of work aim to alleviate this pain: (1) automatic index composition/tuning, which composes indexes out of core data structure primitives to optimize for specific workloads; (2) generalized index templates, which extend common data structures such as B+-trees to custom queries over custom data types; and (3) data structure dynamization frameworks such as the Bentley-Saxe method, which converts a static data structure into an updatable one with bounded additional query cost. The first two are limited to very specific queries and/or data structures and are therefore not suitable for building a general index dynamization framework. The third is more promising in its generality, but it also has limitations regarding query types, deletion support, and performance tuning. In this paper, we discuss the limitations of the classic index dynamization techniques and propose a path towards a more general and systematic solution. We demonstrate the viability of our framework by realizing it as a C++20 metaprogramming library and conducting case studies on four example queries with their corresponding static index structures. With this framework, many theoretical or early-stage index designs can easily be extended with support for updates, along with a wide tuning space for query/update performance trade-offs. This allows index designers to focus on efficient data layouts and query algorithms, thereby dramatically narrowing the gap between novel index designs and deployment.
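To make the Bentley-Saxe idea mentioned in the abstract concrete, the following is a minimal sketch of the classic method, not the API of the library described in the paper. It assumes a hypothetical static structure, `SortedArray`, that can only be built in bulk and answers range-count queries, and a hypothetical wrapper, `BentleySaxeIndex`, that adds inserts by keeping levels of size 0 or 2^i and rebuilding them in the manner of a binary counter. All class and member names are illustrative assumptions, and deletions are deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical static index: built once from a batch of keys, never updated.
class SortedArray {
public:
    SortedArray() = default;
    explicit SortedArray(std::vector<int> keys) : keys_(std::move(keys)) {
        std::sort(keys_.begin(), keys_.end());
    }
    // Number of keys k with lo <= k <= hi, found by two binary searches.
    std::size_t range_count(int lo, int hi) const {
        if (lo > hi) return 0;
        auto first = std::lower_bound(keys_.begin(), keys_.end(), lo);
        auto last = std::upper_bound(keys_.begin(), keys_.end(), hi);
        return static_cast<std::size_t>(last - first);
    }
    const std::vector<int>& keys() const { return keys_; }
    bool empty() const { return keys_.empty(); }

private:
    std::vector<int> keys_;
};

// Hypothetical dynamized wrapper: level i is either empty or holds 2^i keys.
class BentleySaxeIndex {
public:
    void insert(int key) {
        std::vector<int> carry{key};
        std::size_t i = 0;
        // Like incrementing a binary counter: absorb every full level
        // below the first empty one into the carry, then rebuild once.
        while (i < levels_.size() && !levels_[i].empty()) {
            const std::vector<int>& k = levels_[i].keys();
            carry.insert(carry.end(), k.begin(), k.end());
            levels_[i] = SortedArray{};
            ++i;
        }
        if (i == levels_.size()) levels_.emplace_back();
        assert(carry.size() == (std::size_t{1} << i));
        levels_[i] = SortedArray(std::move(carry));
        ++size_;
    }
    // Range count is decomposable: the answer over the whole set is the
    // sum of the answers over the disjoint levels.
    std::size_t range_count(int lo, int hi) const {
        std::size_t total = 0;
        for (const SortedArray& level : levels_) total += level.range_count(lo, hi);
        return total;
    }
    std::size_t size() const { return size_; }
    std::size_t nonempty_levels() const {
        std::size_t n = 0;
        for (const SortedArray& level : levels_) n += level.empty() ? 0 : 1;
        return n;
    }

private:
    std::vector<SortedArray> levels_;
    std::size_t size_ = 0;
};
```

In this sketch, inserting five keys leaves levels 0 and 2 occupied (5 is 101 in binary), and a query consults each level once, so the number of static structures probed grows only logarithmically with the number of keys. The sketch also shows the restrictions the abstract alludes to: it relies on the query being decomposable into per-level answers, and it offers no way to delete a key.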

Towards Systematic Index Dynamization
