On Efficiently Processing MIT Queries in Trajectory Data
Jian Chen, Hong Gao, Kaiqi Zhang, Jiachi Wang, Yubo Luo, Zhenqing Wu, Jianzhong Li
DOI: https://doi.org/10.1109/tkde.2024.3361948
IF: 9.235
2024-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: The Maximizing Influence (Max-Inf) query is a fundamental operation in spatial data management. Given a set of weighted objects, this query finds the location in a candidate set that maximizes its influence, which is the total weight of its reverse nearest neighbors. Existing work commonly assumes that every object is at a fixed location. In real life, however, there is a wide variety of drive-in services (e.g., food joints, pharmacies, and ATMs) that are accessed by mobile users (i.e., trajectories) rather than fixed ones. In this paper, we first define the Maximizing Influence query over Trajectories, namely the MIT query, which aims to find an optimal location that maximizes the total weight of influenced trajectories. We propose a novel index, the QB-tree, which hierarchically groups trajectories with similar activity regions for subsequent unified processing and classifies the trajectories inside the same node into multiple buckets according to their motion patterns. For each bucket, we construct a rectilinear polygon from its trajectories to exclude irrelevant areas of the minimum bounding rectangle. Moreover, we develop a branch-and-bound approach called BBM to solve the MIT query efficiently. The algorithm adaptively partitions the candidates into disjoint regions and prunes the regions that cannot contain an optimal result. By exploiting the QB-tree, the upper and lower bounds are computed efficiently with a three-level pruning technique. We also study a variant of the MIT query of practical interest, called the MDT query, and propose novel pruning bounds that work with the QB-tree to answer MDT queries efficiently. Finally, extensive experiments on real and synthetic datasets demonstrate that our index and algorithms perform well in terms of efficiency, scalability, and genericity.
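
As an illustration of the fixed-location Max-Inf definition stated in the abstract (not of the MIT query, the QB-tree, or BBM, whose details the abstract does not give), the following is a minimal brute-force sketch. It assumes 2D points, Euclidean distance, and a set of existing facilities, so that an object counts as a reverse nearest neighbor of a candidate when the candidate is strictly closer to it than every existing facility. The function names `influence` and `max_inf` and the sample data are hypothetical.

```python
from math import dist


def influence(candidate, objects, facilities):
    """Total weight of objects that would have `candidate` as nearest facility.

    Assumption: an object is a reverse nearest neighbor of the candidate when
    the candidate is strictly closer than every existing facility.
    """
    total = 0.0
    for point, weight in objects:
        # Distance from the object to its closest existing facility
        # (infinity when there are no existing facilities).
        nearest_existing = min(
            (dist(point, f) for f in facilities), default=float("inf")
        )
        if dist(point, candidate) < nearest_existing:
            total += weight
    return total


def max_inf(candidates, objects, facilities):
    """Brute-force baseline: evaluate every candidate and keep the best one."""
    return max(candidates, key=lambda c: influence(c, objects, facilities))


# Hypothetical data: (location, weight) pairs, one existing facility,
# and two candidate locations.
objects = [((0, 0), 1.0), ((10, 0), 2.0), ((10, 1), 3.0)]
facilities = [(5, 0)]
candidates = [(0, 1), (10, 0.5)]

best = max_inf(candidates, objects, facilities)
```

This baseline costs one distance computation per (candidate, object, facility) triple, which is the kind of exhaustive evaluation that index-based pruning is meant to avoid.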