Abstract: Dynamic convolution learns a linear mixture of n static kernels weighted by input-dependent attentions, and it outperforms normal convolution. However, it increases the number of convolutional parameters by a factor of n and is therefore not parameter efficient. As a result, no prior work has allowed researchers to explore the setting n>100 (an order of magnitude larger than the typical setting n<10) to push forward the performance boundary of dynamic convolution while remaining parameter efficient. To fill this gap, we propose KernelWarehouse, a more general form of dynamic convolution, which redefines the basic concepts of "kernels", "assembling kernels" and "attention function" by exploiting convolutional parameter dependencies within the same layer and across neighboring layers of a ConvNet. We validate the effectiveness of KernelWarehouse on the ImageNet and MS-COCO datasets using various ConvNet architectures. Intriguingly, KernelWarehouse is also applicable to Vision Transformers, and it can even reduce the model size of a backbone while improving its accuracy. For instance, KernelWarehouse (n=4) achieves 5.61%|3.90%|4.38% absolute top-1 accuracy gain on the ResNet18|MobileNetV2|DeiT-Tiny backbone, and KernelWarehouse (n=1/4) with 65.10% model size reduction still achieves 2.29% gain on the ResNet18 backbone. The code and models are available at https://github.com/OSVAI/KernelWarehouse.
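
The mixture described in the abstract can be illustrated with a short NumPy sketch. It only shows the general idea of dynamic convolution (a weighted sum of n static kernels) and the resulting n-fold parameter growth. The softmax attention computed from a globally pooled input, and the names `mix_kernels` and `attn_weight`, are assumptions made for illustration and are not taken from the paper or its code.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over a 1-D vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def mix_kernels(kernels, x, attn_weight):
    """Return the input-dependent kernel sum_i alpha_i(x) * W_i.

    kernels:     (n, c_out, c_in, k, k) static kernels
    x:           (c_in, h, w) a single input feature map
    attn_weight: (n, c_in) hypothetical linear layer producing attention logits
    """
    pooled = x.mean(axis=(1, 2))                  # global average pooling -> (c_in,)
    alpha = softmax(attn_weight @ pooled)         # one attention weight per kernel -> (n,)
    mixed = np.tensordot(alpha, kernels, axes=1)  # weighted sum -> (c_out, c_in, k, k)
    return mixed, alpha


rng = np.random.default_rng(0)
n, c_out, c_in, k = 4, 8, 3, 3
kernels = rng.standard_normal((n, c_out, c_in, k, k))
attn_weight = rng.standard_normal((n, c_in))
x = rng.standard_normal((c_in, 16, 16))

mixed, alpha = mix_kernels(kernels, x, attn_weight)

# A static layer stores one kernel; the dynamic layer stores n of them.
static_params = c_out * c_in * k * k
dynamic_params = kernels.size
print(mixed.shape, dynamic_params // static_params)
```

The mixed kernel has the same shape as a single static kernel, so the convolution itself costs the same; what grows by a factor of n is the number of stored kernel parameters.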

What problem does this paper attempt to address?

This paper aims to resolve the tension between parameter efficiency and performance in dynamic convolution. Specifically, dynamic convolution improves model performance by using a linear combination of multiple static convolution kernels, but this approach significantly increases the number of model parameters, resulting in a large model that is not parameter efficient. It is therefore difficult for researchers to explore larger-scale dynamic convolution settings (for example, \(n > 100\)), which limits further gains in performance.

To solve this problem, the authors propose **KernelWarehouse**, a more general form of dynamic convolution. By redefining the basic concepts of "kernel", "assembling kernels" and "attention function", and by exploiting convolutional parameter dependencies within the same layer and across neighboring layers, it achieves higher parameter efficiency and stronger representation ability. Specific objectives include:

1. **Improve parameter efficiency**: By reducing the dimension of convolution kernels and sharing parameters, more kernels can be used without significantly increasing the number of model parameters.
2. **Explore a larger number of convolution kernels**: Explore kernel counts an order of magnitude larger than in existing methods (for example, \(n > 100\)) while maintaining parameter efficiency, so as to push the performance boundary of dynamic convolution.
3. **Improve model performance**: Verify the effectiveness of KernelWarehouse on the ImageNet and MS-COCO datasets, and show its superior performance on different ConvNet architectures.

### Main contributions

- Proposed **KernelWarehouse**, a more efficient dynamic convolution design that maintains parameter efficiency while significantly increasing the number of convolution kernels.
- Redefined the basic concepts of dynamic convolution through three key components: Kernel Partition, Warehouse Construction-with-Sharing, and Contrasting-driven Attention Function.
- Conducted extensive experiments on multiple datasets and model architectures to verify the effectiveness and superiority of KernelWarehouse.

Through these improvements, KernelWarehouse can significantly improve model performance while keeping the number of model parameters under control, providing new ideas and methods for research on dynamic convolution.
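
The idea of using smaller kernels drawn from a shared pool can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the kernel is split into m equal cells by plain flattening, every cell is a mixture of n small cells taken from one shared `warehouse`, and a softmax over fixed `logits` stands in for the paper's contrasting-driven attention function, which in a real layer would depend on the input. All names are hypothetical.

```python
import numpy as np


def softmax_rows(z):
    # Numerically stable softmax applied independently to each row.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def assemble_kernel(warehouse, logits, kernel_shape):
    """Build a full kernel from m cells, each a mixture of n shared cells.

    warehouse:    (n, cell_size) small kernel cells shared by all m positions
    logits:       (m, n) attention logits, one row per kernel cell (assumed given)
    kernel_shape: target shape; its element count must equal m * cell_size
    """
    alpha = softmax_rows(logits)   # (m, n) mixing weights
    cells = alpha @ warehouse      # (m, cell_size) assembled cells
    return cells.reshape(kernel_shape), alpha


rng = np.random.default_rng(0)
c_out, c_in, k = 8, 4, 3
static_params = c_out * c_in * k * k   # 288 parameters in one static kernel

m = 16                                 # number of cells the kernel is split into
cell_size = static_params // m         # 18 parameters per cell
n = 4                                  # number of cells stored in the warehouse

warehouse = rng.standard_normal((n, cell_size))
logits = rng.standard_normal((m, n))

kernel, alpha = assemble_kernel(warehouse, logits, (c_out, c_in, k, k))

# Stored kernel parameters relative to one static kernel: n / m.
budget = warehouse.size / static_params
print(kernel.shape, warehouse.size, budget)
```

In this toy setting the warehouse holds 72 parameters against 288 for the static kernel, so the stored kernel parameters shrink even though each cell is still a mixture of several candidates. Raising n while keeping the cells small is what makes a large number of mixture components affordable.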

KernelWarehouse: Rethinking the Design of Dynamic Convolution

KernelWarehouse: Towards Parameter-Efficient Dynamic Convolution

Omni-Dimensional Dynamic Convolution

Strengthening Dynamic Convolution With Attention and Residual Connection in Kernel Space

Learning Lightweight Dynamic Kernels With Attention Inside via Local–Global Context Fusion

Are Large Kernels Better Teachers than Transformers for ConvNets?

Shift-ConvNets: Small Convolutional Kernel with Large Kernel Effects

InceptionNeXt: When Inception Meets ConvNeXt

A Pre-defined Sparse Kernel Based Convolution for Deep CNNs

Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations

Deep Clustered Convolutional Kernels

DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Vision Transformers

Depth-Wise Convolutions in Vision Transformers for Efficient Training on Small Datasets

An Efficient Kernel Transformation Architecture for Binary- and Ternary-Weight Neural Network Inference.

Efficient Higher-order Convolution for Small Kernels in Deep Learning

DualConv: Dual Convolutional Kernels for Lightweight Deep Neural Networks

Majority Kernels: An Approach to Leverage Big Model Dynamics for Efficient Small Model Training

XSepConv: Extremely Separated Convolution for Efficient Deep Networks with Large Kernels

Structured Convolutions for Efficient Neural Network Design

Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets