Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling

Harry Jake Cunningham, Giorgio Giannone, Mingtian Zhang, Marc Peter Deisenroth
2024-08-18
Abstract: Global convolutions have shown increasing promise as powerful general-purpose sequence models. However, training long convolutions is challenging, and kernel parameterizations must be able to learn long-range dependencies without overfitting. This work introduces reparameterized multi-resolution convolutions ($\texttt{MRConv}$), a novel approach to parameterizing global convolutional kernels for long-sequence modelling. By leveraging multi-resolution convolutions, incorporating structural reparameterization and introducing learnable kernel decay, $\texttt{MRConv}$ learns expressive long-range kernels that perform well across various data modalities. Our experiments demonstrate state-of-the-art performance on the Long Range Arena, Sequential CIFAR, and Speech Commands tasks among convolution models and linear-time transformers. Moreover, we report improved performance on ImageNet classification by replacing 2D convolutions with 1D $\texttt{MRConv}$ layers.
Machine Learning
What problem does this paper attempt to address?
This paper addresses how to capture long-range dependencies in long-sequence modelling without overfitting and without incurring high computational cost during training. The authors point out that current deep learning models, such as Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and Transformers, face challenges on long sequences, including training instability, difficulty prioritizing important context, and high computational complexity. State-Space Models (SSMs) handle extremely long sequences well, but the complex mathematics and linear algebra needed to generate their kernels become a bottleneck that slows the downstream convolution. To address these problems, the paper introduces Reparameterized Multi-Resolution Convolutions (MRConv), a new way of parameterizing global convolutional kernels for long-sequence modelling. By combining multi-resolution convolutions, structural reparameterization, and learnable kernel decay, MRConv learns expressive long-range kernels that perform well across various data modalities. Experiments show that MRConv achieves state-of-the-art performance among convolution models and linear-time transformers on the Long Range Arena, Sequential CIFAR, and Speech Commands tasks. The paper also reports improved performance on ImageNet classification when 2D convolutions are replaced with 1D MRConv layers.
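The structural reparameterization idea mentioned above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It only shows the general linearity property such schemes rely on: a weighted sum of convolutions with kernels of different lengths equals a single convolution with the weighted sum of the zero-padded kernels. The helper names (`causal_fft_conv`, `merge_kernels`), the doubling kernel lengths, and the mixing weights `alphas` are assumptions chosen for illustration, and learnable kernel decay is left out because the summary does not specify its form.

```python
import numpy as np


def causal_fft_conv(u, k):
    """Causal convolution of signal u (length L) with kernel k (length <= L) via FFT."""
    L = len(u)
    n = 2 * L  # zero-pad so the circular FFT product equals a linear convolution
    y = np.fft.irfft(np.fft.rfft(u, n=n) * np.fft.rfft(k, n=n), n=n)
    return y[:L]  # keep only the first L (causal) outputs


def merge_kernels(kernels, alphas, L):
    """Zero-pad each sub-kernel to length L and add them up with weights alphas."""
    merged = np.zeros(L)
    for k, a in zip(kernels, alphas):
        merged[: len(k)] += a * k
    return merged


rng = np.random.default_rng(0)
L = 64
u = rng.standard_normal(L)

# Hypothetical sub-kernels at doubling lengths (illustrative choice).
kernels = [rng.standard_normal(n) for n in (8, 16, 32, 64)]
alphas = np.array([0.5, 0.3, 0.15, 0.05])  # illustrative mixing weights

# Multi-branch form: one convolution per resolution, then a weighted sum.
y_branches = sum(a * causal_fft_conv(u, k) for k, a in zip(kernels, alphas))

# Merged form: combine the kernels once, then run a single convolution.
k_merged = merge_kernels(kernels, alphas, L)
y_merged = causal_fft_conv(u, k_merged)

# Largest absolute difference between the two forms (floating-point error only).
max_err = float(np.max(np.abs(y_branches - y_merged)))
```

Under these assumptions, the multi-branch form costs one convolution per resolution, whereas the merged form needs a single convolution regardless of how many branches were combined.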