Robust Federated Learning Over the Air: Combating Heavy-Tailed Noise with Median Anchored Clipping

Jiaxing Li, Zihan Chen, Kai Fong Ernest Chong, Bikramjit Das, Tony Q. S. Quek, Howard H. Yang
2024-09-23
Abstract: Leveraging over-the-air computations for model aggregation is an effective approach to cope with the communication bottleneck in federated edge learning. By exploiting the superposition properties of multi-access channels, this approach facilitates an integrated design of communication and computation, thereby enhancing system privacy while reducing implementation costs. However, the inherent electromagnetic interference in radio channels often exhibits heavy-tailed distributions, giving rise to exceptionally strong noise in globally aggregated gradients that can significantly deteriorate the training performance. To address this issue, we propose a novel gradient clipping method, termed Median Anchored Clipping (MAC), to combat the detrimental effects of heavy-tailed noise. We also derive analytical expressions for the convergence rate of model training with analog over-the-air federated learning under MAC, which quantitatively demonstrates the effect of MAC on training performance. Extensive experimental results show that the proposed MAC algorithm effectively mitigates the impact of heavy-tailed noise, hence substantially enhancing system robustness.
Machine Learning, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the heavy-tailed noise, caused by electromagnetic interference, that corrupts model aggregation via over-the-air computation in federated edge learning. Specifically:

1. **Communication Bottleneck and Privacy Protection**: By exploiting the superposition property of multi-access channels, over-the-air computation relieves the communication bottleneck in federated edge learning while enhancing system privacy and reducing implementation costs.
2. **Impact of Heavy-Tailed Noise**: However, the inherent electromagnetic interference in wireless channels often exhibits a heavy-tailed distribution, which produces exceptionally strong noise in the globally aggregated gradient and thereby significantly degrades training performance.

To solve this problem, the authors propose a new gradient clipping method, Median Anchored Clipping (MAC). MAC mitigates the impact of heavy-tailed noise on the training process through the following steps:

- **Centralization**: Subtract the median from each element of the globally aggregated gradient. The median is the constant that minimizes the L1 deviation from the elements.
- **Clipping**: Clip each centered gradient value so that its magnitude does not exceed a specified threshold.
- **Recovery**: Add the median back to each element to restore the gradient information.

Through these steps, MAC largely preserves the original gradient information while alleviating the adverse effects of heavy-tailed noise. In addition, the authors derive the convergence rate of analog over-the-air federated learning under MAC and verify the effectiveness of the algorithm through extensive experiments. The experimental results show that MAC substantially improves the robustness and training stability of the system, and that it performs especially well under extreme noise conditions.
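The centralization step rests on the fact that the median minimizes the sum of absolute (L1) deviations. The following is a minimal numerical check of that property, not code from the paper; the example vector `g`, the helper `l1_deviation`, and the search grid are all illustrative assumptions.

```python
import numpy as np

# Hypothetical aggregated gradient with one entry hit by a strong noise spike.
g = np.array([0.2, -0.1, 0.3, 50.0, 0.1])

def l1_deviation(c):
    """Sum of absolute deviations of the entries of g from the constant c."""
    return np.abs(g - c).sum()

# Brute-force search over a grid of candidate anchors (step size 0.01).
candidates = np.linspace(-1.0, 51.0, 5201)
best = candidates[np.argmin([l1_deviation(c) for c in candidates])]

median = np.median(g)
mean = g.mean()

print("grid minimizer of the L1 deviation:", best)
print("median:", median, "-> L1 deviation:", l1_deviation(median))
print("mean:  ", mean, "-> L1 deviation:", l1_deviation(mean))
```

In this example the grid minimizer coincides with the median, while the mean is dragged toward the spike and gives a larger L1 deviation, which illustrates why the median is a more stable anchor than the mean when outliers are present.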
In summary, the main contributions of this paper include:

- Proposing a new gradient clipping method, MAC, for mitigating the impact of heavy-tailed noise.
- Deriving the convergence rate of training under the MAC algorithm.
- Demonstrating the effectiveness and robustness of the MAC algorithm through experiments.

The formulas are as follows:

- Definition of median: \[ \text{med}(w)=\text{median}\{w_i, i\in [d]\} \]
- Centralization operation: \[ g_k\leftarrow g_k - \text{med}(g_k)\cdot\mathbf{1} \]
- Clipping operation: \[ g_{k,i}\leftarrow \text{sgn}(g_{k,i})\cdot\min(|g_{k,i}|, C) \]
- Recovery operation: \[ \check{g}_k\leftarrow g_k+\text{med}(g_k)\cdot\mathbf{1} \]

Here C is the clipping threshold. In the recovery operation, the median is the one computed before centralization, so the same value that was subtracted is added back. Together, these three operations constitute the MAC algorithm.
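The three operations above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `median_anchored_clip`, the example vector, and the threshold value are hypothetical, and the median is computed once, before centering, and then reused in the recovery step.

```python
import numpy as np

def median_anchored_clip(g, C):
    """Apply centralization, clipping, and recovery to a gradient vector g.

    g : 1-D array, the globally aggregated (noisy) gradient.
    C : positive float, the clipping threshold.
    """
    g = np.asarray(g, dtype=float)
    med = np.median(g)                 # median, computed once before centering
    centered = g - med                 # centralization: subtract the median
    clipped = np.sign(centered) * np.minimum(np.abs(centered), C)  # clipping
    return clipped + med               # recovery: add the same median back

# Hypothetical gradient in which one coordinate is hit by a noise spike.
g = np.array([0.2, -0.1, 0.3, 50.0, 0.1])
out = median_anchored_clip(g, C=1.0)
print(out)
```

With these inputs, entries lying within C of the median are left unchanged up to floating-point rounding, and the spiked entry is pulled back to the median plus C. Anchoring the clip at the median rather than at zero means the clipping window follows the bulk of the gradient values instead of being centered on the origin.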