Push-sum Distributed Dual Averaging Online Convex Optimization With Bandit Feedback

Ju Yang, Mengli Wei, Yan Wang, Zhongyuan Zhao
DOI: https://doi.org/10.1007/s12555-023-0211-3
IF: 2.964
2024-05-11
International Journal of Control Automation and Systems
Abstract: This paper investigates the distributed online convex optimization problem in multi-agent systems, where each node cannot directly access the gradient information of its own cost function. The communication topology is modeled by strongly connected time-varying directed graphs with column-stochastic weight matrices, and each node updates its own decisions by exchanging information with neighbouring nodes. Because the cost functions change over time in the online setting, it is not feasible to sample objective function values at several points simultaneously. To solve this problem over directed graphs, a push-sum one-point bandit distributed dual averaging (PS-OBDDA) algorithm is proposed, in which a one-point gradient estimator is used in place of the true gradient to guide the updating of the decision variables. Moreover, with an appropriate choice of the exploration parameter δ and the step sizes α(t), the algorithm is shown to achieve a sublinear regret bound. Furthermore, the effect of the one-point estimation parameters on the regret of the algorithm in online settings is explored. Finally, the performance of the algorithm is evaluated through simulation.
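To make the idea of a one-point gradient estimator concrete, the following is a minimal sketch of the standard textbook form, g = (d/δ) f(x + δu) u with u drawn uniformly from the unit sphere. It is not taken from the paper: the function names `one_point_gradient` and `averaged_estimate`, the test function, and all parameter values are illustrative assumptions, and the PS-OBDDA algorithm itself (push-sum averaging and the dual averaging update) is not reproduced here.

```python
import math
import random


def one_point_gradient(f, x, delta, rng):
    """Hypothetical helper: estimate a gradient from ONE function evaluation.

    Uses the standard one-point form (d / delta) * f(x + delta * u) * u,
    where u is uniform on the unit sphere. This is an assumption about the
    estimator's form, not the paper's exact definition.
    """
    d = len(x)
    # Draw u uniformly from the unit sphere by normalizing a Gaussian vector.
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in z))
    u = [v / norm for v in z]
    # Single (bandit) evaluation of the cost at the perturbed point.
    value = f([xi + delta * ui for xi, ui in zip(x, u)])
    return [(d / delta) * value * ui for ui in u]


def averaged_estimate(f, x, delta, samples, seed=0):
    """Average many one-point estimates to show they are unbiased on average."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(samples):
        g = one_point_gradient(f, x, delta, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / samples for t in total]


if __name__ == "__main__":
    # Illustrative linear cost f(x) = x1 - 2*x2, whose gradient is (1, -2).
    linear = lambda x: x[0] - 2.0 * x[1]
    print(averaged_estimate(linear, [0.0, 0.0], delta=0.5, samples=20000))
```

For a linear cost the average of many estimates approaches the true gradient, but a single estimate scales like f(x)/δ, so a small δ inflates the variance while a large δ increases the smoothing bias. This trade-off is one reason the choice of the exploration parameter δ affects the regret.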
automation & control systems