Abstract: In this paper, we propose a novel neural network structure, namely feedforward sequential memory networks (FSMN), to model long-term dependencies in time series without using recurrent feedback. The proposed FSMN is a standard fully connected feedforward neural network equipped with learnable memory blocks in its hidden layers. The memory blocks use a tapped-delay line structure to encode long context information into a fixed-size representation as a short-term memory mechanism, which is somewhat similar to the layers of time-delay neural networks. We have evaluated FSMNs on several standard benchmark tasks, including speech recognition and language modeling. Experimental results show that FSMNs outperform conventional recurrent neural networks (RNNs) while being learned much more reliably and faster when modeling sequential signals such as speech or language. Moreover, we also propose a compact feedforward sequential memory network (cFSMN) by combining FSMN with low-rank matrix factorization and making a slight modification to the encoding method used in FSMNs to further simplify the network architecture. On the Switchboard speech recognition task, the proposed cFSMN structures reduce the model size by 60% and speed up learning by more than seven times, while still significantly outperforming the popular bidirectional LSTMs under both frame-level cross-entropy training and MMI-based sequence training.
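To make the tapped-delay line idea concrete, the following is a minimal sketch and not the implementation used in the paper. It assumes the simplest possible form of a memory block: one scalar coefficient per tap, applied to the current hidden activation and a fixed number of past activations, so that every time step gets an encoding of the same size regardless of sequence length. The function name `tapped_delay_memory`, the parameter names, and the coefficient values are all hypothetical and chosen only for illustration; in a trained network the coefficients would be learned.

```python
from typing import List, Sequence

Vector = List[float]


def tapped_delay_memory(
    hidden: Sequence[Sequence[float]], taps: Sequence[float]
) -> List[Vector]:
    """Encode each time step's recent history into a fixed-size vector.

    hidden: T hidden-layer activation vectors, all of the same dimension.
    taps:   coefficients where taps[i] weights the activation i steps back
            (assumed scalar per tap; given here, learned in a real model).
    """
    if not hidden:
        return []
    dim = len(hidden[0])
    memory: List[Vector] = []
    for t in range(len(hidden)):
        encoded = [0.0] * dim
        for i, coeff in enumerate(taps):
            if t - i < 0:
                break  # no activations exist before the start of the sequence
            past = hidden[t - i]
            for d in range(dim):
                encoded[d] += coeff * past[d]
        memory.append(encoded)
    return memory


if __name__ == "__main__":
    # Toy hidden activations for a 3-step sequence (illustrative values only).
    hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    coefficients = [0.5, 0.25, 0.125]  # current step, 1 step back, 2 steps back
    for t, encoded in enumerate(tapped_delay_memory(hidden_states, coefficients)):
        print(t, encoded)
```

Because the computation is a finite weighted sum over past activations, it involves no recurrent feedback: each output depends only on the inputs of the layer, which is the property the abstract contrasts with RNNs.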
