Hand-crafted Attention is All You Need? A Study of Attention on Self-supervised Audio Transformer.

Tsung-Han Wu, Chun-Cheng Hsieh, Yen-Hao Chen, Po-Han Chi, Hung-yi Lee
2020-01-01
Abstract: In this paper, we seek to reduce the computational complexity of transformer-based models for speech representation learning. We evaluate 10 attention mechanisms; we then pre-train the transformer-based model with each of these attention mechanisms in a self-supervised fashion and use the pre-trained models as feature extractors on downstream tasks, including phoneme classification and speaker classification. We find that the proposed approach, which uses only hand-crafted and learnable attentions, performs comparably to full self-attention.