Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling

Harry Jake Cunningham, Giorgio Giannone, Mingtian Zhang, Marc Peter Deisenroth

2024-08-18

Abstract: Global convolutions have shown increasing promise as powerful general-purpose sequence models. However, training long convolutions is challenging, and kernel parameterizations must be able to learn long-range dependencies without overfitting. This work introduces reparameterized multi-resolution convolutions ($\texttt{MRConv}$), a novel approach to parameterizing global convolutional kernels for long-sequence modelling. By leveraging multi-resolution convolutions, incorporating structural reparameterization and introducing learnable kernel decay, $\texttt{MRConv}$ learns expressive long-range kernels that perform well across various data modalities. Our experiments demonstrate state-of-the-art performance on the Long Range Arena, Sequential CIFAR, and Speech Commands tasks among convolution models and linear-time transformers. Moreover, we report improved performance on ImageNet classification by replacing 2D convolutions with 1D $\texttt{MRConv}$ layers.

Machine Learning

What problem does this paper attempt to address?

This paper addresses how to capture long-range dependencies in long-sequence modelling without overfitting and without incurring high computational cost during training. Specifically, the authors point out that current deep-learning models such as Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and Transformers face challenges when processing long sequences, including training instability, an inability to prioritize important context effectively, and high computational complexity. In addition, although State-Space Models (SSMs) perform well on extremely long sequences, their intricate mathematics and linear-algebra operations can become a bottleneck that slows the downstream convolution. To address these problems, the paper introduces reparameterized multi-resolution convolutions (MRConv), a new way of parameterizing global convolutional kernels for long-sequence modelling. By combining multi-resolution convolutions, structural reparameterization, and learnable kernel decay, MRConv learns highly expressive long-range kernels that perform well across various data modalities. The experiments report state-of-the-art performance among convolutional models and linear-time transformers on the Long Range Arena, Sequential CIFAR, and Speech Commands tasks, as well as improved performance on ImageNet classification when 2D convolutional layers are replaced with 1D MRConv layers.
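
The idea behind structural reparameterization can be illustrated with a minimal NumPy sketch. It relies only on the linearity of convolution: a weighted sum of several convolution branches with kernels of different lengths equals a single convolution with the weighted sum of the zero-padded kernels. The function names `causal_conv` and `merge_kernels`, the kernel lengths, and the per-branch scalar weights are assumptions made for illustration; this is not the authors' exact parameterization, and the paper's learnable decay is not reproduced here.

```python
import numpy as np

def causal_conv(u, k):
    """Causal linear convolution of signal u with kernel k via FFT.

    The FFT length is chosen so that no circular wrap-around occurs;
    the output is truncated to the input length.
    """
    L = len(u)
    n = L + len(k) - 1
    y = np.fft.irfft(np.fft.rfft(u, n) * np.fft.rfft(k, n), n)
    return y[:L]

def merge_kernels(kernels, weights, L):
    """Fold several weighted sub-kernels into one kernel of length L.

    Each sub-kernel is zero-padded to length L, scaled by its
    (hypothetical) branch weight, and summed.
    """
    merged = np.zeros(L)
    for k, a in zip(kernels, weights):
        merged[:len(k)] += a * k
    return merged

# Illustrative setup: three branches with kernels of increasing length.
rng = np.random.default_rng(0)
L = 64
u = rng.standard_normal(L)
kernels = [rng.standard_normal(n) for n in (4, 16, 64)]
weights = [1.0, 0.5, 0.25]  # assumed values, not taken from the paper

# Multi-branch form: one convolution per resolution, then a weighted sum.
multi_branch = sum(a * causal_conv(u, k) for a, k in zip(weights, kernels))

# Reparameterized form: merge the kernels once, then a single convolution.
single_branch = causal_conv(u, merge_kernels(kernels, weights, L))

assert np.allclose(multi_branch, single_branch)
```

In this sketch the multi-branch form costs one FFT convolution per resolution, while the merged form needs only one, which is the practical appeal of merging branches after training.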
