Privacy Preserving Distributed Bandit Residual Feedback Online Optimization Over Time-Varying Unbalanced Graphs

Zhongyuan Zhao,Zhiqiang Yang,Luyao Jiang,Ju Yang,Quanbo Ge
DOI: https://doi.org/10.1109/jas.2024.124656
2024-10-12
IEEE/CAA Journal of Automatica Sinica
Abstract:This paper considers the distributed online optimization (DOO) problem over time-varying unbalanced networks, where gradient information is explicitly unknown. To address this issue, a privacy-preserving distributed online one-point residual feedback (OPRF) optimization algorithm is proposed. This algorithm updates decision variables by leveraging one-point residual feedback to estimate the true gradient information. It can achieve the same performance as the two-point feedback scheme while only requiring a single function value query per iteration. Additionally, it effectively eliminates the effect of time-varying unbalanced graphs by dynamically constructing row stochastic matrices. Furthermore, compared to other distributed optimization algorithms that only consider explicitly unknown cost functions, this paper also addresses the issue of privacy information leakage of nodes. Theoretical analysis demonstrate that the method attains sublinear regret while protecting the privacy information of agents. Finally, numerical experiments on distributed collaborative localization problem and federated learning confirm the effectiveness of the algorithm.
automation & control systems
What problem does this paper attempt to address?