Optimizing Convolutional Neural Network Architecture

Luis Balderas, Miguel Lastra, José M. Benítez
2023-12-17
Abstract: Convolutional Neural Networks (CNNs) are widely used to tackle challenging tasks such as speech recognition, natural language processing, and computer vision. As CNN architectures get larger and more complex, their computational requirements increase, incurring significant energy costs and complicating their deployment on resource-restricted devices. In this paper, we propose Optimizing Convolutional Neural Network Architecture (OCNNA), a novel CNN optimization and construction method based on pruning and knowledge distillation, designed to establish the importance of convolutional layers. The proposal has been evaluated through a thorough empirical study including the best-known datasets (CIFAR-10, CIFAR-100 and ImageNet) and CNN architectures (VGG-16, ResNet-50, DenseNet-40 and MobileNet), setting Accuracy Drop and Remaining Parameters Ratio as objective metrics to compare the performance of OCNNA against other state-of-the-art approaches. Our method has been compared with more than 20 convolutional neural network simplification algorithms, obtaining outstanding results. As a result, OCNNA is a competitive CNN construction method that could ease the deployment of neural networks on IoT or other resource-limited devices.
Computer Vision and Pattern Recognition, Machine Learning
What problem does this paper attempt to address?
The paper addresses the optimization and construction of Convolutional Neural Networks (CNNs). As CNNs grow in scale and complexity, their computational demands grow as well, which raises energy costs and makes deployment on resource-constrained devices difficult. The paper proposes Optimizing Convolutional Neural Network Architecture (OCNNA), a method based on pruning and knowledge distillation that establishes the importance of convolutional layers. In an empirical study covering the CIFAR-10, CIFAR-100, and ImageNet datasets and the VGG-16, ResNet-50, DenseNet-40, and MobileNet architectures, OCNNA is compared with more than 20 CNN simplification algorithms, using Accuracy Drop and Remaining Parameters Ratio as the metrics, and is reported to be competitive with state-of-the-art methods. By reducing a model's storage requirements and computational cost, the approach is intended to ease the deployment of neural networks on IoT or other resource-limited devices.
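
The two evaluation metrics named above can be illustrated with a minimal Python sketch. The text here does not give their formulas, so the definitions below are assumptions based on common usage in the pruning literature: accuracy drop as baseline accuracy minus simplified-model accuracy, and remaining parameters ratio as the simplified model's parameter count divided by the original count. The function names and the input values are hypothetical placeholders, not results from the paper.

```python
def accuracy_drop(baseline_acc: float, simplified_acc: float) -> float:
    """Assumed definition: baseline accuracy minus simplified-model accuracy.

    A positive value means the simplified model is less accurate;
    a negative value means it is more accurate than the baseline.
    """
    return baseline_acc - simplified_acc


def remaining_parameters_ratio(original_params: int, simplified_params: int) -> float:
    """Assumed definition: fraction of the original parameters that remain."""
    if original_params <= 0:
        raise ValueError("original_params must be positive")
    return simplified_params / original_params


if __name__ == "__main__":
    # Placeholder inputs chosen only to show the arithmetic.
    drop = accuracy_drop(baseline_acc=0.5, simplified_acc=0.25)
    ratio = remaining_parameters_ratio(original_params=1000, simplified_params=250)
    print(f"accuracy drop: {drop:.2f}")
    print(f"remaining parameters ratio: {ratio:.2f}")
```

Under these assumed definitions, a method is preferable when both numbers are small: little accuracy is lost and few parameters remain.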