Abstract: Top-K sparsification-based compression has been extensively explored as a way to reduce communication costs in distributed learning. However, we identify several issues with existing Top-K sparsification-based compression methods: (i) the limited compressibility of the Top-K parameters' indexes critically restricts the overall communication compression ratio; (ii) several time-consuming compression operations significantly offset the benefits of communication compression; (iii) the error feedback techniques used to maintain model quality result in a high memory footprint. To solve these issues, we propose BIRD, a lightweight tensor-wise Bi-Random sampling strategy with an expectation invariance property. Specifically, BIRD applies a tensor-wise index sharing mechanism that reduces the proportion of the payload spent on indexes by allowing multiple tensor elements to share a single index, thus improving the overall compression ratio. Additionally, BIRD replaces the time-consuming Top-K sorting with a faster Bi-Random sampling strategy built on this index sharing mechanism, significantly reducing compression overheads. Moreover, BIRD builds an expectation invariance property into the Bi-Random sampling to ensure an approximately unbiased representation of the L1-norm of the sampled tensors, effectively maintaining model quality without incurring extra memory costs. We further optimize BIRD into BIRD+ by introducing uniform distribution-based sampling and Gamma correction into the tensor-wise sampling process, achieving a more flexible adjustment of the sparsity with better convergence performance. Experimental evaluations across multiple conventional distributed learning tasks demonstrate that, compared to state-of-the-art approaches, BIRD+ achieves communication compression ratios up to 36.2× higher and computation throughput up to 149.6× higher while maintaining model quality without incurring extra memory costs.
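To make the index-overhead argument and the expectation-invariance idea concrete, the following is a minimal Python sketch. It is not the BIRD or BIRD+ algorithm: the abstract does not specify the Bi-Random sampling rule, so the sketch substitutes a generic unbiased sparsifier that keeps each whole tensor with a fixed probability and rescales it by the inverse of that probability. The function names, the 32-bit value and index widths, and the fixed `keep_prob` are all assumptions made for illustration.

```python
import random


def topk_payload_bits(k, value_bits=32, index_bits=32):
    """Bits sent by element-wise Top-K: every kept value carries its own index.

    The 32-bit widths are assumptions for illustration.
    """
    return k * (value_bits + index_bits)


def tensorwise_payload_bits(kept_sizes, value_bits=32, index_bits=32):
    """Bits sent when all elements of a kept tensor share a single index."""
    return sum(size * value_bits + index_bits for size in kept_sizes)


def sample_tensors(tensors, keep_prob, rng):
    """Keep each whole tensor with probability keep_prob, rescaled by 1/keep_prob.

    Returns {tensor_index: rescaled_values}. Inverse-probability rescaling is a
    generic unbiased sparsifier assumed here; it stands in for, and is not, the
    Bi-Random sampling rule of the paper.
    """
    kept = {}
    for idx, values in enumerate(tensors):
        if rng.random() < keep_prob:
            kept[idx] = [v / keep_prob for v in values]
    return kept


def l1_norm(kept):
    """L1-norm of everything that survived sampling."""
    return sum(abs(v) for values in kept.values() for v in values)


def mean_sampled_l1(tensors, keep_prob, trials, seed=0):
    """Average the sampled L1-norm over many independent trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += l1_norm(sample_tensors(tensors, keep_prob, rng))
    return total / trials


if __name__ == "__main__":
    n, k = 1_000_000, 10_000                      # keep 1% of the elements
    dense_bits = 32 * n
    # Element-wise Top-K: indexes double the payload, halving the ratio.
    print(dense_bits / topk_payload_bits(k))      # 50.0, versus an ideal 100
    # One kept tensor of 10,000 elements sharing one index: ratio close to 100.
    print(dense_bits / tensorwise_payload_bits([10_000]))

    tensors = [[1.0] * 4, [2.0] * 4, [-3.0] * 4]  # true L1-norm is 24
    print(mean_sampled_l1(tensors, keep_prob=0.25, trials=20_000))
```

Under these assumptions, per-element indexes cut the Top-K compression ratio roughly in half, while a shared index makes the overhead negligible for large tensors. The averaged L1-norm of the sampled tensors stays close to the true value of 24 without any error-feedback buffer, which is the kind of behavior an expectation-preserving sampler aims for.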

BIRD: A Lightweight and Adaptive Compressor for Communication-Efficient Distributed Learning Using Tensor-wise Bi-Random Sampling

BIRD+: Design of a Lightweight Communication Compressor for Resource-Constrained Distribution Learning Platforms

Birder: Communication-Efficient 1-Bit Adaptive Optimizer for Practical Distributed DNN Training

Towards Efficient Network Compression Via Few-Shot Slimming

FedComp: A Federated Learning Compression Framework for Resource-Constrained Edge Computing Devices

Deep Learning Model Compression with Rank Reduction in Tensor Decomposition

Holistic CNN Compression Via Low-Rank Decomposition with Knowledge Transfer

Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression

Back-and-Forth prediction for deep tensor compression

Efficient Model Compression via Global Sparsification for Over-the-Air Federated Learning

Neural Network Compression Via Sparse Optimization

Lion Cub: Minimizing Communication Overhead in Distributed Lion

Unified Low-rank Compression Framework for Click-through Rate Prediction

On Compressing Deep Models by Low Rank and Sparse Decomposition

Compressed Communication for Distributed Training: Adaptive Methods and System

Low-Rate Feature Compression for Collaborative Intelligence: Reducing Redundancy in Spatial and Statistical Levels

Supervised Compression for Resource-Constrained Edge Computing Systems

Decentralized Deep Learning with Arbitrary Communication Compression

Combining CBAM and Iterative Shrinkage-Thresholding Algorithm for Compressive Sensing of Bird Images

Adaptive Compression for Communication-Efficient Distributed Training

LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time