Learning Context-Aware Probabilistic Maximum Coverage Bandits: A Variance-Adaptive Approach
Xutong Liu, Jinhang Zuo, Junkai Wang, Zhiyong Wang, Yuedong Xu, John C. S. Lui
DOI: https://doi.org/10.1109/infocom52122.2024.10621257
2024-01-01
Abstract: Probabilistic maximum coverage (PMC) is an important framework that can model many network applications, including mobile crowdsensing, content delivery, and task replication. In PMC, an operator chooses nodes in a graph that can probabilistically cover other nodes, aiming to maximize the total reward from the covered nodes. To tackle the challenge of unknown parameters in network environments, PMC is studied in the online learning setting, i.e., the PMC bandit. However, existing PMC bandits lack context-awareness and fail to exploit valuable contextual information, limiting their efficiency and adaptability in dynamic environments. To address this limitation, we propose a novel context-aware PMC bandit model (C-PMC). C-PMC employs a linear structure to model the mean outcome of each arm, effectively incorporating contextual information and enhancing its applicability to large-scale network systems. We then design a variance-adaptive contextual combinatorial upper confidence bound algorithm (VAC$^2$UCB), which uses second-order statistics, specifically variance, to re-weight feedback data and estimate unknown parameters. Our theoretical analysis shows that C-PMC achieves a regret of $\tilde{O}(d\sqrt{|V|T})$, independent of the number of edges $|E|$ and action size $K$. Finally, we conduct experiments on synthetic and real-world datasets, showing the superior performance of VAC$^2$UCB in context-aware mobile crowdsensing and user-targeted content delivery applications.
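To make the objective in the abstract concrete, the following is a minimal sketch of a context-aware coverage reward with known parameters. It is not the paper's algorithm and involves no learning or confidence bounds. It assumes that edges fire independently, that each target node carries a fixed reward weight, and that the mean of each edge (arm) is the inner product of a parameter vector with the edge's feature vector, clipped to [0, 1]. All names (`edge_prob`, `expected_reward`, `greedy_select`, `theta`, `features`, `weights`) and the toy data are hypothetical.

```python
from math import prod

def edge_prob(theta, x):
    """Assumed linear mean model: p = <theta, x>, clipped to [0, 1]."""
    p = sum(t * xi for t, xi in zip(theta, x))
    return min(1.0, max(0.0, p))

def expected_reward(chosen, targets, features, theta, weights):
    """Expected total reward, assuming edges fire independently.

    A target v is covered unless every chosen node u fails to cover it,
    so P(v covered) = 1 - prod_u (1 - p_{u,v}).
    """
    total = 0.0
    for v in targets:
        miss = prod(1.0 - edge_prob(theta, features[(u, v)]) for u in chosen)
        total += weights[v] * (1.0 - miss)
    return total

def greedy_select(candidates, targets, features, theta, weights, k):
    """Pick k nodes one at a time, each maximizing the expected reward so far."""
    chosen = []
    for _ in range(k):
        remaining = [u for u in candidates if u not in chosen]
        best = max(
            remaining,
            key=lambda u: expected_reward(
                chosen + [u], targets, features, theta, weights
            ),
        )
        chosen.append(best)
    return chosen

# Toy instance (hypothetical data): two candidate nodes, two target nodes.
theta = [0.5, 0.25]
candidates = ["a", "b"]
targets = ["x", "y"]
features = {
    ("a", "x"): [1, 0],  # p = 0.5
    ("a", "y"): [0, 1],  # p = 0.25
    ("b", "x"): [1, 1],  # p = 0.75
    ("b", "y"): [0, 0],  # p = 0.0
}
weights = {"x": 1.0, "y": 2.0}

both = expected_reward(["a", "b"], targets, features, theta, weights)
first_pick = greedy_select(candidates, targets, features, theta, weights, 1)
```

In the bandit setting the parameter vector is unknown, so a learner would replace `theta` with an estimate built from observed edge feedback; the sketch only illustrates the reward that such an estimate is plugged into.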