Efficient Training Acceleration via Sample-Wise Dynamic Probabilistic Pruning

Feicheng Huang, Wenbo Zhou, Yue Huang, Xinghao Ding
DOI: https://doi.org/10.1109/lsp.2024.3484289
2024-11-09
IEEE Signal Processing Letters
Abstract: Data pruning is observed to substantially reduce the computation and memory costs of model training. Previous studies have primarily focused on constructing a series of coresets with representative samples by leveraging predefined rules for evaluating sample importance. Learning dynamics and selection bias, however, are rarely considered. In this letter, a novel Sample-wise Dynamic Probabilistic Pruning (SwDPP) method is proposed for efficient training. Specifically, instead of hard-pruning the samples that are considered easy or well-learned, we formulate the pruning process as a probabilistic sampling problem. This is achieved by a carefully designed soft-selection mechanism, which continually reflects learning dynamics and relaxes selection bias. Moreover, to alleviate the accuracy drop under high pruning rates, we introduce a probabilistic Mixup strategy to maintain information diversity. Extensive experiments conducted on CIFAR-10, CIFAR-100 and Tiny-ImageNet show that the proposed SwDPP outperforms current state-of-the-art methods across various pruning settings. Notably, on CIFAR-10 and CIFAR-100, SwDPP achieves lossless training acceleration using only 70% of the data per epoch.
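To make the contrast between hard pruning and probabilistic sampling concrete, the following is a minimal sketch, not the SwDPP method itself. The abstract does not state how sample importance is scored, so the sketch assumes that the most recent per-sample loss serves as the importance signal and that keep probabilities are proportional to it; the names `keep_probabilities`, `sample_epoch_subset`, and `keep_ratio` are hypothetical.

```python
import random


def keep_probabilities(losses, keep_ratio):
    """Map per-sample losses to keep probabilities in [0, 1].

    Assumption (not from the paper): a higher loss means the sample is less
    well learned, so it is kept with higher probability. Probabilities are
    scaled toward an average of keep_ratio and then clipped at 1.0.
    """
    n = len(losses)
    total = sum(losses)
    if total == 0:
        # No signal yet: fall back to uniform sampling.
        return [keep_ratio] * n
    return [min(1.0, keep_ratio * n * loss / total) for loss in losses]


def sample_epoch_subset(losses, keep_ratio, rng):
    """Draw an independent Bernoulli keep decision for every sample."""
    probs = keep_probabilities(losses, keep_ratio)
    return [i for i, p in enumerate(probs) if rng.random() < p]


# Illustrative losses for eight samples; the values are made up.
rng = random.Random(0)
losses = [0.05, 0.10, 0.80, 1.20, 0.40, 0.02, 0.95, 0.30]
kept = sample_epoch_subset(losses, keep_ratio=0.7, rng=rng)
print("indices kept this epoch:", kept)
```

Because the subset is redrawn every epoch from updated losses, a low-loss sample is down-weighted rather than permanently discarded, which is the general idea behind soft selection; a hard-pruning rule would instead drop every sample below a fixed threshold.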
engineering, electrical & electronic