Top-k Most Influential Locations Selection

Jin Huang,Zeyi Wen,Jianzhong Qi,Rui Zhang,Jian Chen,Zhen He
DOI: https://doi.org/10.1145/2063576.2063971
2011-01-01
Abstract:We propose and study a new type of facility location selection query, the top-k most influential location selection query. Given a set M of customers and a set F of existing facilities, this query finds k locations from a set C of candidate locations with the largest influence values, where the influence of a candidate location c (c in C) is defined as the number of customers in M who are the reverse nearest neighbors of c. We first present a naive algorithm to process the query. However, the algorithm is computationally expensive and not scalable to large datasets. This motivates us to explore more efficient solutions. We propose two branch and bound algorithms, the Estimation Expanding Pruning (EEP) algorithm and the Bounding Influence Pruning (BIP) algorithm. These algorithms exploit various geometric properties to prune the search space, and thus achieve much better performance than that of the naive algorithm. Specifically, the EEP algorithm estimates the distances to the nearest existing facilities for the customers and the numbers of influenced customers for the candidate locations, and then gradually refines the estimation until the answer set is found, during which distance metric based pruning techniques are used to improve the refinement efficiency. BIP only estimates the numbers of influenced customers for the candidate locations. But it uses the existing facilities to limit the space for searching the influenced customers and achieve a better estimation, which results in an even more efficient algorithm. Extensive experiments conducted on both real and synthetic datasets validate the efficiency of the algorithms.
What problem does this paper attempt to address?