Pursuing Overall Welfare in Federated Learning through Sequential Decision Making

Seok-Ju Hahn, Gi-Soo Kim, Junghye Lee
2024-05-31
Abstract: In traditional federated learning, a single global model cannot perform equally well for all clients. Therefore, the need to achieve client-level fairness in federated systems has been emphasized, which can be realized by modifying the static aggregation scheme for updating the global model to an adaptive one, in response to the local signals of the participating clients. Our work reveals that existing fairness-aware aggregation strategies can be unified into an online convex optimization framework, in other words, a central server's sequential decision making process. To enhance the decision making capability, we propose simple and intuitive improvements for suboptimal designs within existing methods, presenting AAggFF. Considering practical requirements, we further subdivide our method into variants tailored for the cross-device and the cross-silo settings, respectively. Theoretical analyses guarantee sublinear regret upper bounds for both settings: $\mathcal{O}(\sqrt{T \log{K}})$ for the cross-device setting, and $\mathcal{O}(K \log{T})$ for the cross-silo setting, with $K$ clients and $T$ federation rounds. Extensive experiments demonstrate that the federated system equipped with AAggFF achieves a better degree of client-level fairness than existing methods in both practical settings. Code is available at https://github.com/vaseline555/AAggFF
Machine Learning; Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
The paper aims to achieve client-level fairness in federated learning (FL). A single global model in traditional FL cannot perform equally well for all clients, so the server needs an adaptive aggregation strategy that responds to each client's local signals when updating the global model, ensuring that every client benefits from the trained model. Specifically, the paper points out:

1. **Statistical heterogeneity**: Because data distributions differ across clients, a single global model performs unevenly from client to client. This heterogeneity damages client-level fairness.
2. **Limitations of existing methods**: Existing fairness-aware aggregation strategies already improve on the static aggregation scheme, but the paper identifies suboptimal design choices in how they adapt to client signals.
3. **Decision making with few samples**: In each federated learning round, the server receives only limited local signals (such as loss values), which makes it difficult to update the mixing coefficients accurately, especially when the number of clients is large (for example, in cross-device settings).

To address these problems, the authors propose a framework named AAggFF. It unifies existing fairness-aware aggregation methods under the Online Convex Optimization (OCO) framework and provides variants designed for two application scenarios (cross-silo and cross-device) to improve client-level fairness.

### Formula summary

- **FL objective function**:
  \[ \min_{\theta \in \mathbb{R}^d} F(\theta)=\sum_{i = 1}^K p_i F_i(\theta) \]
  where \(p_i\geq0\) and \(\sum_{i = 1}^K p_i = 1\).
- **Update rule in the OCO framework**:
  \[ p^{(t + 1)}=\arg\min_{p\in\Delta^{K - 1}}\ell^{(t)}(p)+\eta R(p) \]
  where \(\ell^{(t)}(p)=-\langle p, r^{(t)}\rangle\) is the decision loss, \(R(p)\) is the regularization term, and \(\eta\) is the step size.
- **Negative log growth as decision loss**:
  \[ \ell^{(t)}(p)=-\log(1+\langle p, r^{(t)}\rangle) \]
- **Update rule of AAggFF-S (cross-silo setting)**:
  \[ p^{(t + 1)}=\arg\min_{p\in\Delta^{K - 1}}\sum_{\tau = 1}^t\tilde{\ell}^{(\tau)}(p)+\alpha\|p\|^2_2+\beta\sum_{\tau = 1}^t\left(\langle g^{(\tau)}, p - p^{(\tau)}\rangle\right)^2 \]
- **Update rule of AAggFF-D (cross-device setting)**:
  \[ p^{(t + 1)}=\arg\min_{p\in\Delta^{K - 1}}\sum_{\tau = 1}^t\tilde{\ell}^{(\tau)}(p)+\eta^{(t + 1)}R(p) \]
  where \(R(p)=\sum_{i = 1}^K p_i\log p_i\) is the negative entropy regularization term.

### Theoretical guarantee

The paper proves sublinear regret upper bounds for both variants, with \(K\) clients and \(T\) federation rounds:

- **Cross-silo setting (AAggFF-S)**: \(O(L_\infty K\log T)\)
- **Cross-device setting (AAggFF-D)**: \(O(L_\infty\sqrt{T\log K})\)
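
To make the AAggFF-D update rule from the formula summary more concrete, the following is a minimal Python sketch, not the authors' implementation. It rests on several assumptions that the summary does not confirm: the surrogate loss \(\tilde{\ell}^{(\tau)}\) is taken to be the linearization \(\langle g^{(\tau)}, p\rangle\) of the negative log growth loss, the regularization weight is held fixed instead of following a schedule \(\eta^{(t+1)}\), and every client reports a response in each round (no client sampling). Under those assumptions, the minimizer over the simplex with the negative entropy regularizer has the standard closed form \(p_i \propto \exp(-G_i/\eta)\), where \(G\) is the cumulative gradient. All function names and the response values are hypothetical.

```python
import math


def log_growth_loss(p, r):
    """Decision loss -log(1 + <p, r>) from the formula summary."""
    return -math.log(1.0 + sum(pi * ri for pi, ri in zip(p, r)))


def log_growth_grad(p, r):
    """Gradient of the decision loss with respect to p: -r / (1 + <p, r>)."""
    denom = 1.0 + sum(pi * ri for pi, ri in zip(p, r))
    return [-ri / denom for ri in r]


def entropic_ftrl_step(cum_grad, eta):
    """Minimize <cum_grad, p> + eta * sum_i p_i log p_i over the simplex.

    The closed-form solution is a softmax of -cum_grad / eta.
    """
    scores = [-g / eta for g in cum_grad]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def run_rounds(responses, eta=1.0):
    """Update mixing coefficients over a list of response vectors r^(t)."""
    num_clients = len(responses[0])
    p = [1.0 / num_clients] * num_clients  # start from uniform coefficients
    cum_grad = [0.0] * num_clients
    for r in responses:
        g = log_growth_grad(p, r)  # assumed linearization point: current p
        cum_grad = [c + gi for c, gi in zip(cum_grad, g)]
        p = entropic_ftrl_step(cum_grad, eta)
    return p


# Hypothetical responses in [0, 1]: the third client reports the largest signal.
p_final = run_rounds([[0.1, 0.2, 0.9]] * 5, eta=1.0)
```

In this sketch, a client that keeps reporting a larger response accumulates a more negative gradient and therefore receives a larger mixing coefficient, which is the qualitative behavior a fairness-aware aggregation scheme aims for. A smaller `eta` makes the coefficients react more sharply to the accumulated signal.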