Abstract: Efficiently capturing the long-range patterns in sequential data sources salient to a given task -- such as classification and generative modeling -- poses a fundamental challenge. Popular approaches in the space trade off between the memory burden of brute-force enumeration and comparison, as in transformers, the computational burden of complicated sequential dependencies, as in recurrent neural networks, or the parameter burden of convolutional networks with many or large filters. We instead take inspiration from wavelet-based multiresolution analysis to define a new building block for sequence modeling, which we call a MultiresLayer. The key component of our model is the multiresolution convolution, capturing multiscale trends in the input sequence. Our MultiresConv can be implemented with shared filters across a dilated causal convolution tree. Thus it garners the computational advantages of convolutional networks and the principled theoretical motivation of wavelet decompositions. Our MultiresLayer is straightforward to implement, requires significantly fewer parameters, and maintains at most an $\mathcal{O}(N\log N)$ memory footprint for a length $N$ sequence. Yet, by stacking such layers, our model yields state-of-the-art performance on a number of sequence classification and autoregressive density estimation tasks using CIFAR-10, ListOps, and PTB-XL datasets.

What problem does this paper attempt to address?

This paper proposes a new sequence modeling method aimed at efficiently capturing the long-range patterns in sequence data that matter for a given task (such as classification or generative modeling). The main contributions are:

1. **Multiresolution convolution layer (MultiresLayer)**: Inspired by wavelet multiresolution analysis (MRA), the authors design a new sequence modeling building block, the MultiresLayer, whose core is the multiresolution convolution (MultiresConv). This convolution captures trends at different time scales in the input sequence.
2. **Efficient memory mechanism**: The multiresolution convolution builds a memory of past data at each time step. To keep computation and parameter count manageable, the paper proposes the TREESELECT mechanism, which retains only a subset of the representation coefficients as memory vectors.
3. **Theoretical foundation**: When the filters are fixed to predefined wavelet filters, the multiresolution convolution reduces to the traditional discrete wavelet transform. The model instead lets these filters be learned, so it can go beyond hand-designed wavelet filters.
4. **Simple and powerful architecture**: The MultiresLayer is built from simple dilated causal convolutions and linear transformations, which makes it easy to parallelize and parameter-efficient. Its multiresolution structure also gives the model a theoretical interpretation.
5. **Experimental results**: The paper evaluates the model on several sequence classification and autoregressive density estimation tasks, including sequential image classification on CIFAR-10, list-operation prediction on ListOps, and multi-label electrocardiogram classification on the PTB-XL dataset. The reported results show state-of-the-art performance on these tasks.
In summary, the paper addresses the problem of capturing long-range dependencies in sequence modeling by introducing a new architecture based on multiresolution analysis, and it reports experiments that support the effectiveness of this approach.
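The idea of sharing two filters across a tree of dilated causal convolutions can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `causal_dilated_conv` and `multires_conv`, the single-channel input, and the convention that level `j` uses dilation `2**j` are assumptions made for illustration. With fixed Haar filters, the outputs at suitably aligned time steps coincide with Haar wavelet approximation coefficients, which is the sense in which the construction reduces to a discrete wavelet transform.

```python
import numpy as np


def causal_dilated_conv(x, h, dilation):
    """Causal dilated convolution: y[t] = sum_k h[k] * x[t - k * dilation].

    Positions before the start of the sequence are treated as zeros.
    """
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for k, hk in enumerate(h):
        shift = k * dilation
        if shift >= len(x):
            break
        y[shift:] += hk * x[: len(x) - shift]
    return y


def multires_conv(x, h0, h1, depth):
    """Apply the same low-pass (h0) and high-pass (h1) filters at every level.

    Level j uses dilation 2**j (an assumed convention). Returns the final
    approximation sequence and the list of detail sequences, one per level.
    Each returned sequence has the input's length, so keeping all of them
    for depth ~ log2(N) costs on the order of N log N values.
    """
    a = np.asarray(x, dtype=float)
    details = []
    for j in range(depth):
        d = 2 ** j
        details.append(causal_dilated_conv(a, h1, d))  # detail at this scale
        a = causal_dilated_conv(a, h0, d)              # coarser approximation
    return a, details


# Haar filters as a fixed, non-learned special case.
s = 1.0 / np.sqrt(2.0)
x = np.arange(1.0, 9.0)  # [1, 2, ..., 8]
approx, details = multires_conv(x, h0=[s, s], h1=[s, -s], depth=2)

# At t = 3 and t = 7 the level-2 approximation is sum(x[0:4]) / 2 and
# sum(x[4:8]) / 2, i.e. approximately 5 and 13.
print(np.round(approx[[3, 7]], 6))
```

In a learned version, `h0` and `h1` would be trainable parameters, and a selection step such as the paper's TREESELECT would decide which of the returned coefficients to keep as memory; that step is not modeled here.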

Sequence Modeling with Multiresolution Convolutional Memory

Reparameterized Multi-Resolution Convolutions for Long Sequence Modelling

MixCon: A Hybrid Architecture for Efficient and Adaptive Sequence Modeling

Efficiently Modeling Long Sequences with Structured State Spaces

Long Sequence Hopfield Memory

Spontaneous Temporal Grouping Neural Network for Long-Term Memory Modeling

Non-local Recurrent Neural Memory for Supervised Sequence Modeling

Temporal Convolutional Attention-based Network For Sequence Modeling

Combining Recurrent, Convolutional, and Continuous-time Models with Linear State-Space Layers

Residual Attention Net for Superior Cross-Domain Time Sequence Modeling

Learning Sequence Representations by Non-local Recurrent Neural Memory

Multiscale Low-Frequency Memory Network for Improved Feature Extraction in Convolutional Neural Networks

Multiscale sequence modeling with a learned dictionary

Efficient Multi-Sequence Memory with Controllable Steady-State Period and High Sequence Storage Capacity

Multi-Memory Convolutional Neural Network for Video Super-Resolution

Convolutional State Space Models for Long-Range Spatiotemporal Modeling

CNN with large memory layers

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions