Diversity-Aware $k$-Maximum Inner Product Search Revisited

Qiang Huang, Yanhao Wang, Yiqun Sun, Anthony K. H. Tung
2024-02-21
Abstract: The $k$-Maximum Inner Product Search ($k$MIPS) serves as a foundational component in recommender systems and various data mining tasks. However, while most existing $k$MIPS approaches prioritize the efficient retrieval of highly relevant items for users, they often neglect an equally pivotal facet of search results: \emph{diversity}. To bridge this gap, we revisit and refine the diversity-aware $k$MIPS (D$k$MIPS) problem by incorporating two well-known diversity objectives -- minimizing the average and maximum pairwise item similarities within the results -- into the original relevance objective. This enhancement, inspired by Maximal Marginal Relevance (MMR), offers users a controllable trade-off between relevance and diversity. We introduce \textsc{Greedy} and \textsc{DualGreedy}, two linear scan-based algorithms tailored for D$k$MIPS. They both achieve data-dependent approximations and, when aiming to minimize the average pairwise similarity, \textsc{DualGreedy} attains an approximation ratio of $1/4$ with an additive term for regularization. To further improve query efficiency, we integrate a lightweight Ball-Cone Tree (BC-Tree) index with the two algorithms. Finally, comprehensive experiments on ten real-world data sets demonstrate the efficacy of our proposed methods, showcasing their capability to efficiently deliver diverse and relevant search results to users.
Information Retrieval, Data Structures and Algorithms, Databases
What problem does this paper attempt to address?
This paper addresses how to make the results of a recommender system more diverse without giving up their relevance. Traditional $k$-Maximum Inner Product Search ($k$MIPS) methods focus on efficiently retrieving the items most relevant to a user, but they often ignore the diversity of the returned results. Diversity matters for user experience: it reduces the risk of overwhelming users with overly homogeneous suggestions and encourages them to explore different areas of interest.

To this end, the paper revisits and refines the diversity-aware $k$MIPS (D$k$MIPS) problem. It adds two well-known diversity objectives to the original relevance objective: minimizing the average pairwise similarity and minimizing the maximum pairwise similarity of the items in the result set. This refinement is inspired by Maximal Marginal Relevance (MMR) and gives users a controllable trade-off between relevance and diversity.

The main contributions of the paper are:

1. **Problem Redefinition**: Simplify the pre-processing stage, work in a single space, and propose a new D$k$MIPS formulation in which the diversity of a result set is measured by the inner products between the item vectors it contains.
2. **Algorithm Design**: Propose two linear scan-based algorithms, Greedy and DualGreedy. Both achieve data-dependent approximations; when the objective is to minimize the average pairwise similarity, DualGreedy also attains an approximation ratio of 1/4 with an additive regularization term.
3. **Optimization Technique**: Introduce optimization techniques that avoid unnecessary re-evaluations in each iteration, improving efficiency.
4. **Index Integration**: Integrate a lightweight Ball-Cone Tree (BC-Tree) index with the two algorithms to further improve query efficiency.

Experiments on ten real-world data sets show that these methods efficiently return search results that are both diverse and relevant.
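To make the relevance/diversity trade-off concrete, the following is a minimal sketch of a generic MMR-style greedy selection over inner products. It is not the paper's Greedy or DualGreedy algorithm, and it does not reproduce the paper's exact objective: the scoring rule, the function name `mmr_greedy`, the weight `lam`, and the `mode` switch are all assumptions made for illustration. At each step it picks the item that maximizes `lam * <item, query>` minus `(1 - lam)` times either the average or the maximum inner product with the items already chosen.

```python
def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mmr_greedy(items, query, k, lam=0.5, mode="avg"):
    """Select up to k item indices with an MMR-style greedy rule.

    Illustrative assumption, not the paper's algorithm:
      score(i) = lam * <items[i], query> - (1 - lam) * penalty(i)
    where penalty(i) is the average (mode="avg") or maximum (mode="max")
    inner product between items[i] and the items selected so far.
    """
    n = len(items)
    relevance = [inner(p, query) for p in items]
    selected = []
    sim_sum = [0.0] * n              # sum of inner products with selected items
    sim_max = [float("-inf")] * n    # max inner product with selected items

    for _ in range(min(k, n)):
        best, best_score = None, float("-inf")
        for i in range(n):
            if i in selected:
                continue
            if not selected:
                penalty = 0.0        # nothing chosen yet: pure relevance
            elif mode == "avg":
                penalty = sim_sum[i] / len(selected)
            else:
                penalty = sim_max[i]
            score = lam * relevance[i] - (1.0 - lam) * penalty
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        # Update the running similarity statistics with the new item.
        for i in range(n):
            s = inner(items[i], items[best])
            sim_sum[i] += s
            sim_max[i] = max(sim_max[i], s)
    return selected


# Toy data: items 0 and 1 are near-duplicates, item 2 points elsewhere.
items = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query = [1.0, 0.5]
print(mmr_greedy(items, query, k=2, lam=1.0))  # relevance only -> [0, 1]
print(mmr_greedy(items, query, k=2, lam=0.5))  # with diversity  -> [0, 2]
```

With `lam=1.0` the rule reduces to plain $k$MIPS and returns the two near-duplicate items; lowering `lam` swaps the second one for the less similar item, which is the kind of controllable trade-off described above.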