Abstract: Recent advancements in model pruning have focused on developing new algorithms and improving upon benchmarks. However, the practical application of these algorithms across various models and platforms remains a significant challenge. To address this challenge, we propose ONNXPruner, a versatile pruning adapter designed for the ONNX format models. ONNXPruner streamlines the adaptation process across diverse deep learning frameworks and hardware platforms. A novel aspect of ONNXPruner is its use of node association trees, which automatically adapt to various model architectures. These trees clarify the structural relationships between nodes, guiding the pruning process, particularly highlighting the impact on interconnected nodes. Furthermore, we introduce a tree-level evaluation method. By leveraging node association trees, this method allows for a comprehensive analysis beyond traditional single-node evaluations, enhancing pruning performance without the need for extra operations. Experiments across multiple models and datasets confirm ONNXPruner's strong adaptability and increased efficacy. Our work aims to advance the practical application of model pruning.

What problem does this paper attempt to address?

The paper targets two challenges that existing model pruning techniques face in practical applications:

1. **Portability across frameworks and platforms**: Existing pruning algorithms are usually tightly bound to a specific deep learning framework (such as Keras, PyTorch, or PaddlePaddle) and are difficult to migrate to other platforms. Developers must reimplement the pruning algorithm for each framework and then convert the result to the ONNX format for deployment, which is time-consuming and labor-intensive.
2. **Adaptability to complex model structures**: Deep neural networks (DNNs) have complex internal connections, so pruning one node can have a cascading effect on others. Existing pruning methods usually evaluate only the pruned nodes and ignore the impact on related nodes; accounting for that impact then requires additional components and operations.

To address these challenges, the authors propose **ONNXPruner**, a general-purpose model pruning adapter based on ONNX. ONNXPruner tackles them in the following ways:

- **Improving cross-platform interoperability**: Because models from different frameworks can be exported to the standard ONNX format, ONNXPruner works on a single representation, which simplifies conversion and deployment across deep learning frameworks and hardware platforms.
- **Constructing node association trees**: ONNXPruner introduces node association trees to define the relationships between a pruned node and the nodes related to it. This tree structure lets the pruning algorithm adapt automatically to different model structures and track the resulting changes in related nodes.
- **Tree-level evaluation method**: Building on the node association tree, ONNXPruner uses a tree-level evaluation method that assesses complex node connection structures without adding extra components, so that unimportant weights are removed more effectively.

Through these contributions, ONNXPruner aims to make model pruning more practical, so that application developers can integrate and deploy pruning algorithms with less effort.
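The two ideas above, the node association tree and tree-level evaluation, can be illustrated with a small toy sketch. This is not the authors' implementation: the `Node` class only mirrors the `name`, `op_type`, `input`, and `output` fields of an ONNX node. The op sets `PASS_THROUGH` and `CONSUMERS`, the helper names, and the scoring rule (summing per-channel L1 magnitudes over the whole tree) are all assumptions chosen for illustration. Weights are plain nested lists (`[out_ch][in_ch]` for Conv, a per-channel scale for BatchNormalization) so the example has no dependencies.

```python
from dataclasses import dataclass, field

# Illustrative assumption: ops that pass the channel dimension through unchanged.
PASS_THROUGH = {"BatchNormalization", "Relu"}
# Illustrative assumption: ops whose weights are indexed by the incoming channel.
CONSUMERS = {"Conv", "Gemm"}


@dataclass
class Node:
    """Toy stand-in for an ONNX node (name, op_type, input, output)."""
    name: str
    op_type: str
    inputs: list
    outputs: list


@dataclass
class TreeNode:
    node: Node
    children: list = field(default_factory=list)


def build_association_tree(root, graph):
    """Collect the nodes affected by pruning output channels of `root`."""
    tree = TreeNode(root)
    for out in root.outputs:
        for cand in graph:
            if out not in cand.inputs:
                continue
            if cand.op_type in PASS_THROUGH:
                # Channel count propagates, so keep following the data flow.
                tree.children.append(build_association_tree(cand, graph))
            elif cand.op_type in CONSUMERS:
                # Only the input channels of this node shrink; stop here.
                tree.children.append(TreeNode(cand))
    return tree


def flatten(tree):
    """Yield the nodes of the tree in depth-first order."""
    yield tree.node
    for child in tree.children:
        yield from flatten(child)


def tree_channel_scores(tree, weights):
    """Hypothetical tree-level score: per-channel L1 mass summed over the tree."""
    root = tree.node
    n_channels = len(weights[root.name])
    scores = [0.0] * n_channels
    for node in flatten(tree):
        w = weights.get(node.name)
        if w is None:
            continue  # e.g. Relu has no weights
        for c in range(n_channels):
            if node is root:
                scores[c] += sum(abs(v) for v in w[c])      # output filter c
            elif node.op_type == "BatchNormalization":
                scores[c] += abs(w[c])                      # scale of channel c
            elif node.op_type in CONSUMERS:
                scores[c] += sum(abs(row[c]) for row in w)  # input slice c
    return scores


# Toy graph: conv1 -> bn1 -> relu1 -> conv2
graph = [
    Node("conv1", "Conv", ["x", "w1"], ["a"]),
    Node("bn1", "BatchNormalization", ["a", "g1"], ["b"]),
    Node("relu1", "Relu", ["b"], ["c"]),
    Node("conv2", "Conv", ["c", "w2"], ["y"]),
]
weights = {
    "conv1": [[1.0, -1.0], [0.125, 0.125], [2.0, 0.0]],  # 3 out, 2 in
    "bn1": [0.5, 0.25, 1.0],                             # per-channel scale
    "conv2": [[0.5, 0.0, 1.0], [-0.5, 0.25, 1.0]],       # 2 out, 3 in
}

tree = build_association_tree(graph[0], graph)
scores = tree_channel_scores(tree, weights)
weakest = min(range(len(scores)), key=scores.__getitem__)
print([n.name for n in flatten(tree)], scores, weakest)
```

In this toy setup, channel 1 of `conv1` gets the lowest combined score, so it would be the first candidate for removal, together with the matching scale in `bn1` and the matching input slice in `conv2`. A real ONNX graph would also need handling for branching ops such as `Add` or `Concat`, which the sketch ignores.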

ONNXPruner: ONNX-Based General Model Pruning Adapter

Class-Aware Pruning for Efficient Neural Networks

All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management

Single-shot Pruning and Quantization for Hardware-Friendly Neural Network Acceleration

Loss Constrains Added Squeeze and Excitation Blocks for Pruning Deep Neural Networks

Not All Data Matters: An End-to-End Adaptive Dataset Pruning Framework for Enhancing Model Performance and Efficiency

Automatic Attention Pruning: Improving and Automating Model Pruning using Attentions

One-Shot Pruning for Fast-adapting Pre-trained Models on Devices

Adaptive Activation-based Structured Pruning

AdaPruner: Adaptive Channel Pruning and Effective Weights Inheritance

Pruner: A Speculative Exploration Mechanism to Accelerate Tensor Program Tuning

Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch

A Dynamic Pruning Method on Multiple Sparse Structures in Deep Neural Networks

Non-Parametric Adaptive Network Pruning

Accelerate CNNs from Three Dimensions: A Comprehensive Pruning Framework

GenExp: Multi-objective pruning for deep neural network based on genetic algorithm

Separate, Dynamic and Differentiable (SMART) Pruner for Block/Output Channel Pruning on Computer Vision Tasks

Structurally Prune Anything: Any Architecture, Any Framework, Any Time

PruneAug: Bridging DNN Pruning and Inference Latency on Diverse Sparse Platforms Using Automatic Layerwise Block Pruning

Isomorphic Pruning for Vision Models

One-Cycle Pruning: Pruning ConvNets Under a Tight Training Budget