ONNXPruner: ONNX-Based General Model Pruning Adapter

Dongdong Ren, Wenbin Li, Tianyu Ding, Lei Wang, Qi Fan, Jing Huo, Hongbing Pan, Yang Gao
2024-04-11
Abstract: Recent advancements in model pruning have focused on developing new algorithms and improving upon benchmarks. However, the practical application of these algorithms across various models and platforms remains a significant challenge. To address this challenge, we propose ONNXPruner, a versatile pruning adapter designed for ONNX format models. ONNXPruner streamlines the adaptation process across diverse deep learning frameworks and hardware platforms. A novel aspect of ONNXPruner is its use of node association trees, which automatically adapt to various model architectures. These trees clarify the structural relationships between nodes and guide the pruning process, in particular by highlighting the impact on interconnected nodes. Furthermore, we introduce a tree-level evaluation method. By leveraging node association trees, this method allows for a comprehensive analysis beyond traditional single-node evaluations, enhancing pruning performance without the need for extra operations. Experiments across multiple models and datasets confirm ONNXPruner's strong adaptability and increased efficacy. Our work aims to advance the practical application of model pruning.
Machine Learning
What problem does this paper attempt to address?
This paper addresses two major challenges that existing model pruning techniques face in practical applications:

1. **Portability across frameworks and platforms**: Existing pruning algorithms are usually tightly bound to a specific deep learning framework (such as Keras, PyTorch, or PaddlePaddle) and are difficult to migrate across platforms. Developers need to reimplement the pruning algorithm for each framework and then convert the model into the ONNX format for deployment. This process is time-consuming and labor-intensive.
2. **Adaptability to complex model structures**: Deep neural networks (DNNs) have complex internal connections, and pruning a node may cause a cascading effect. Existing pruning methods usually evaluate only the pruned nodes and ignore the impact on related nodes; accounting for that impact requires additional components and operations.

To address these challenges, the authors propose **ONNXPruner**, a general-purpose model pruning adapter based on ONNX. Specifically, ONNXPruner addresses these problems in the following ways:

- **Improving cross-platform interoperability**: By building on ONNX, ONNXPruner works on models from different frameworks once they are standardized into the ONNX format, which simplifies conversion and deployment across deep learning frameworks and hardware platforms.
- **Constructing node association trees**: ONNXPruner introduces node association trees to define the relationships between pruned nodes and their related nodes. This tree structure enables the pruning algorithm to adapt automatically to different model structures and to track changes in related nodes.
- **Tree-level evaluation method**: Using the node association tree, ONNXPruner provides a tree-level evaluation method that can evaluate complex node connection structures without adding extra components, and so removes unimportant weights more effectively.
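
To make the idea of a node association tree concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the `Node` dataclass only mirrors the `name`, `op_type`, `input`, and `output` fields of an ONNX node, and the stopping rule (propagation ends at the next `Conv` or `Gemm`, whose input channels absorb the change) is an assumption made for illustration. The names `build_association_tree`, `flatten`, and `STOP_OPS` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Simplified stand-in for an ONNX graph node (hypothetical, not the onnx API).
    name: str
    op_type: str
    inputs: list
    outputs: list


@dataclass
class TreeNode:
    name: str
    op_type: str
    children: list = field(default_factory=list)


# Assumption: a downstream Conv/Gemm absorbs the channel change in its
# input dimension, so the effect of pruning does not travel past it.
STOP_OPS = {"Conv", "Gemm"}


def build_association_tree(nodes, root_name):
    """Collect every node affected by pruning output channels of root_name."""
    by_name = {n.name: n for n in nodes}
    consumers = {}  # tensor name -> nodes that read that tensor
    for n in nodes:
        for tensor in n.inputs:
            consumers.setdefault(tensor, []).append(n)

    def expand(node, is_root):
        tree = TreeNode(node.name, node.op_type)
        if not is_root and node.op_type in STOP_OPS:
            return tree  # leaf: only its input channels are affected
        for tensor in node.outputs:
            for consumer in consumers.get(tensor, []):
                tree.children.append(expand(consumer, False))
        return tree

    return expand(by_name[root_name], True)


def flatten(tree):
    """Pre-order list of node names in the tree."""
    names = [tree.name]
    for child in tree.children:
        names.extend(flatten(child))
    return names


# Toy graph: conv1 -> bn1 -> relu1 -> {conv2a -> relu2, conv2b}
graph = [
    Node("conv1", "Conv", ["x", "w1"], ["c1"]),
    Node("bn1", "BatchNormalization", ["c1", "gamma1"], ["b1"]),
    Node("relu1", "Relu", ["b1"], ["r1"]),
    Node("conv2a", "Conv", ["r1", "w2a"], ["c2a"]),
    Node("conv2b", "Conv", ["r1", "w2b"], ["c2b"]),
    Node("relu2", "Relu", ["c2a"], ["r2"]),
]

tree = build_association_tree(graph, "conv1")
affected = flatten(tree)
```

In this toy graph, pruning channels of `conv1` touches `bn1`, `relu1`, and the input channels of both `conv2a` and `conv2b`, while `relu2` lies beyond a stopping node and is left out. A real adapter would also need to handle merge operators such as `Add` or `Concat`, which this sketch does not attempt.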
Through these innovations, ONNXPruner aims to make model pruning more practical, enabling application developers to integrate and deploy pruning algorithms with little framework-specific effort.
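
The difference between single-node and tree-level evaluation can be illustrated with a small, dependency-free sketch. The scoring rule below (summing normalized L1 norms across the pruned layer, its batch-norm scale, and the matching input channels of downstream layers) is an assumption chosen for illustration; the summary above does not specify the paper's actual criterion. Weights are 2-D lists (`out_channels x in_channels`), as for a 1x1 convolution or fully connected layer, and all function names are hypothetical.

```python
def l1_per_output_channel(w):
    # One score per row: how large each output channel's weights are.
    return [sum(abs(v) for v in row) for row in w]


def l1_per_input_channel(w):
    # One score per column: how strongly a downstream layer uses each input channel.
    return [sum(abs(row[c]) for row in w) for c in range(len(w[0]))]


def normalize(scores):
    # Scale scores to sum to 1 so that layers of different magnitude are comparable.
    total = sum(scores)
    return [s / total for s in scores] if total else list(scores)


def tree_level_scores(root_w, bn_gamma, child_ws):
    """Combine evidence from every node in the association tree (assumed rule)."""
    parts = [
        normalize(l1_per_output_channel(root_w)),
        normalize([abs(g) for g in bn_gamma]),
    ]
    parts += [normalize(l1_per_input_channel(w)) for w in child_ws]
    return [sum(p[c] for p in parts) for c in range(len(root_w))]


def channels_to_prune(scores, k):
    # Indices of the k lowest-scoring channels, in ascending index order.
    return sorted(sorted(range(len(scores)), key=lambda c: scores[c])[:k])


# Pruned layer: 3 output channels, 2 input channels.
root_w = [[1.0, 1.0], [0.5, 0.5], [2.0, 2.0]]
bn_gamma = [1.0, 1.0, 1.0]
# Downstream layer: 2 output channels, 3 input channels.
# It relies heavily on channel 1 of the pruned layer.
child_w = [[0.1, 3.0, 0.2], [0.1, 3.0, 0.2]]

single_scores = l1_per_output_channel(root_w)
tree_scores = tree_level_scores(root_w, bn_gamma, [child_w])

single_choice = channels_to_prune(single_scores, 1)
tree_choice = channels_to_prune(tree_scores, 1)
```

Here the single-node criterion would remove channel 1 because its own weights are smallest, while the tree-level score keeps it because the downstream layer depends on it and removes channel 0 instead. The example only shows why looking at related nodes can change the pruning decision; it makes no claim about accuracy.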