Abstract: As an important component of non-speech audio classification, environmental sound classification (ESC) has attracted considerable research attention in recent years. Benefiting from the rapid development of deep learning, researchers feed manually extracted speech features into a convolutional neural network (CNN), which extracts deeper abstract features for the final classification task and raises ESC accuracy to a higher level. However, these accuracy gains often rely on ever deeper network structures, which introduces substantial parameter redundancy. The large number of floating-point operations (FLOPs) also slows CNN inference and increases the burden on storage and computing resources. To address this, we propose a CNN pruning technique that compresses the model by reducing its parameter count and FLOPs, thereby eliminating redundancy. Specifically, we use ResNet-20 as the backbone network. First, we pre-train the network so that it achieves good classification performance. We then randomly remove a small number of convolution channels and fine-tune the network to restore accuracy, iterating this process until the model reaches the target compression ratio. We conduct experiments on the UrbanSound8K dataset. Thanks to the strong plasticity of the CNN model, the pruned model shows no significant decrease in accuracy, and at low compression rates accuracy even improves slightly because network redundancy is reduced. Our model achieves accuracy competitive with state-of-the-art methods while being more lightweight.
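The iterative procedure described in the abstract (randomly remove a few channels, fine-tune, repeat until the target compression ratio is reached) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it models a plain chain of 3x3 convolution layers rather than ResNet-20 with its residual connections, the `fine_tune` callback is a hypothetical placeholder for real training, and the names `conv_params`, `iterative_random_prune`, `step_fraction`, and `target_ratio` are introduced here for illustration only.

```python
import random


def conv_params(channels, in_channels=1, kernel=3):
    """Weight count of a plain chain of kernel x kernel conv layers (biases ignored)."""
    total, prev = 0, in_channels
    for c in channels:
        total += kernel * kernel * prev * c
        prev = c
    return total


def iterative_random_prune(channels, target_ratio, step_fraction=0.05,
                           fine_tune=lambda kept: None, seed=0):
    """Randomly drop a few channels per round until the parameter count is
    at most target_ratio times the original count.

    fine_tune is a hypothetical hook; a real pipeline would retrain the
    network here to restore accuracy after each pruning round.
    """
    rng = random.Random(seed)
    kept = [list(range(c)) for c in channels]  # surviving channel indices per layer
    original = conv_params(channels)
    history = []  # remaining-parameter ratio after each round

    while conv_params([len(k) for k in kept]) > target_ratio * original:
        dropped = 0
        for layer in kept:
            n_drop = max(1, int(len(layer) * step_fraction))
            n_drop = min(n_drop, len(layer) - 1)  # always keep at least one channel
            for idx in rng.sample(layer, n_drop):
                layer.remove(idx)
            dropped += n_drop
        if dropped == 0:
            break  # nothing left to prune; target is unreachable
        fine_tune(kept)  # placeholder for the accuracy-restoring fine-tuning step
        history.append(conv_params([len(k) for k in kept]) / original)

    return kept, history


# Usage: prune an assumed three-layer stack to at most half its parameters.
kept, history = iterative_random_prune([16, 32, 64], target_ratio=0.5)
print([len(k) for k in kept], round(history[-1], 3))
```

Removing only a small fraction of channels per round keeps each fine-tuning step short, which is the motivation for iterating rather than pruning to the target in one pass.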
