ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data
Liana Patel, Peter Kraft, Carlos Guestrin, Matei Zaharia
2024-03-08
Abstract: Applications increasingly leverage mixed-modality data, and must jointly
search over vector data, such as embedded images, text and video, as well as
structured data, such as attributes and keywords. Proposed methods for this
hybrid search setting either suffer from poor performance or support a severely
restricted set of search predicates (e.g., only small sets of equality
predicates), making them impractical for many applications. To address this, we
present ACORN, an approach for performant and predicate-agnostic hybrid search.
ACORN builds on Hierarchical Navigable Small Worlds (HNSW), a state-of-the-art
graph-based approximate nearest neighbor index, and can be implemented
efficiently by extending existing HNSW libraries. ACORN introduces the idea of
predicate subgraph traversal to emulate a theoretically ideal, but impractical,
hybrid search strategy. ACORN's predicate-agnostic construction algorithm is
designed to enable this effective search strategy, while supporting a wide
array of predicate sets and query semantics. We systematically evaluate ACORN
on both prior benchmark datasets, with simple, low-cardinality predicate sets,
and complex multi-modal datasets not supported by prior methods. We show that
ACORN achieves state-of-the-art performance on all datasets, outperforming
prior methods with 2-1,000x higher throughput at a fixed recall.
Databases, Information Retrieval