Differentially Private and Budget-Limited Bandit Learning over Matroids

Kai Han,Yuntian He,Alex X. Liu,Shaojie Tang,He Huang
DOI: https://doi.org/10.1287/ijoc.2019.0903
IF: 3.288
2020-01-01
INFORMS Journal on Computing
Abstract:We propose the first budget-limited multi-armed bandit (BMAB) algorithm subject to a union of matroid constraints in arm pulling, while at the same time achieving differential privacy. Our model generalizes the arm-pulling models studied in prior BMAB schemes, and it can be used to address many practical problems such as network backbone construction and dynamic pricing in crowdsourcing. We handle the exploitation versus exploration tradeoff in our BMAB problem by exploiting the combinatorial structures of matroids, and reduce the searching complexity of arm selection based on a divide-and-conquer approach. Our algorithm achieves a uniform logarithmic regret bound with respect to B and ɛ-differential privacy, where B is the budget for pulling the arms with random costs. Without differential privacy, our algorithm achieves a uniform logarithmic regret bound with respect to B, which advances the asymptotic regret bounds achieved by prior BMAB algorithms. We performed side-by-side comparisons with prior schemes in our experiments. Experimental results show that our purely-combinatorial algorithm not only achieves significantly better regret performance, but also is more than 20 times faster than prior BMAB schemes, which use time-consuming LP-solving techniques.
What problem does this paper attempt to address?