A Differentially Private Approach for Budgeted Combinatorial Multi-armed Bandits

Hengzhi Wang, Laizhong Cui, En Wang, Jiangchuan Liu
DOI: https://doi.org/10.1109/tdsc.2024.3401836
2024-01-01
Abstract: As a fundamental tool for sequential decision-making, the Combinatorial Multi-Armed Bandit (CMAB) model has been extensively analyzed and applied in various online applications. However, privacy concerns in budgeted CMAB have rarely been investigated, and few bandit algorithms adequately address the privacy-preserving budgeted CMAB setting. Motivated by this, we study this setting using differential privacy as the formal measure of privacy. In this setting, playing an arm yields both a random reward and a random cost, and these values are kept private. In addition, multiple arms can be played in each round. The objective of the decision-maker is to minimize regret subject to a budget constraint on the cumulative cost of all played arms. We present a bandit policy that balances exploration and exploitation and preserves the privacy of both rewards and costs under budgeted CMAB settings. This policy is proven to be differentially private and achieves an upper bound on regret. Furthermore, to incentivize truthful cost reporting under the differentially private bandit policy, we introduce the concept of truthfulness and incorporate a payment mechanism that is proven to be $\sigma$-truthful. Numerical simulations based on multiple real-world datasets validate the theoretical findings and demonstrate the effectiveness of our policy compared to state-of-the-art policies.
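The setting described in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not the paper's policy: it assumes Bernoulli rewards, uniformly distributed costs in [0, 1], a UCB-style reward-to-cost index, and per-observation Laplace noise with scale 1/epsilon as a stand-in privacy mechanism. The names `laplace_noise` and `run_budgeted_cmab`, the cost floor of 0.05, and the stopping rule are all hypothetical choices made for this example, and the sketch does not reproduce the paper's privacy accounting, regret bound, or payment mechanism.

```python
import math
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def run_budgeted_cmab(mean_rewards, mean_costs, budget, k, epsilon, seed=0):
    """Play k arms per round until the budget can no longer cover a full round.

    Assumption: every mean cost lies in (0, 2/3], so sampled costs stay in [0, 1].
    """
    rng = random.Random(seed)
    n = len(mean_rewards)
    counts = [0] * n
    reward_sums = [0.0] * n  # sums of noisy (privatized) rewards
    cost_sums = [0.0] * n    # sums of noisy (privatized) costs
    spent, total_reward, rounds = 0.0, 0.0, 0

    # Each cost is at most 1, so one round costs at most k; stopping here
    # guarantees the cumulative cost never exceeds the budget.
    while budget - spent >= k:
        rounds += 1

        def score(i):
            if counts[i] == 0:
                return math.inf  # play every arm at least once
            bonus = math.sqrt(2.0 * math.log(rounds + 1) / counts[i])
            r_hat = reward_sums[i] / counts[i] + bonus
            # Floor the optimistic cost estimate to avoid dividing by ~0.
            c_hat = max(cost_sums[i] / counts[i] - bonus, 0.05)
            return r_hat / c_hat

        chosen = sorted(range(n), key=score, reverse=True)[:k]
        for i in chosen:
            reward = 1.0 if rng.random() < mean_rewards[i] else 0.0
            cost = rng.uniform(0.5 * mean_costs[i], 1.5 * mean_costs[i])
            spent += cost
            total_reward += reward
            counts[i] += 1
            # The learner only stores noisy values (sensitivity 1 for [0, 1] data).
            reward_sums[i] += reward + laplace_noise(1.0 / epsilon, rng)
            cost_sums[i] += cost + laplace_noise(1.0 / epsilon, rng)

    return {"rounds": rounds, "spent": spent,
            "reward": total_reward, "counts": counts}


if __name__ == "__main__":
    result = run_budgeted_cmab(
        mean_rewards=[0.2, 0.5, 0.8, 0.4],
        mean_costs=[0.5, 0.4, 0.3, 0.6],
        budget=50.0, k=2, epsilon=1.0,
    )
    print(result)
```

Smaller values of `epsilon` add more noise to the stored statistics, which in this toy version slows down how quickly the index separates good arms from bad ones; this is the privacy and regret trade-off that the abstract alludes to.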