Abstract:<p>Few-shot learning aims to recognize instances of novel classes from only a few labeled samples, a capability of great value in both research and application. Although the area has attracted much recent work, most existing methods target image classification. Video-based few-shot action recognition remains underexplored and challenging for three reasons: (1) differences in implementation details across papers make fair comparison difficult; (2) wide variation and misalignment of temporal sequences make video-level similarity comparison difficult; (3) the scarcity of labeled data makes optimization difficult. To address these problems, this paper presents (1) a specific setting for evaluating few-shot action recognition algorithms; (2) an implicit sequence-alignment algorithm for better video-level similarity comparison; (3) an advanced loss for few-shot learning that optimizes pair similarity with limited data. Specifically, we propose a novel few-shot action recognition framework that applies long short-term memory (LSTM) after 3D convolutional layers for sequence modeling and alignment. Circle loss is introduced to maximize within-class similarity and minimize between-class similarity flexibly, toward a more definite convergence target. Instead of using random or ambiguous experimental settings, we set a concrete evaluation criterion for few-shot action recognition, analogous to the standard image-based few-shot learning setting. Extensive experiments on two datasets demonstrate the effectiveness of the proposed method.</p>
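
The pair-similarity objective mentioned in the abstract can be illustrated with a minimal sketch of circle loss in its commonly published form. This is not the authors' implementation: the function name `circle_loss`, the helper functions, and the hyperparameter values `margin=0.25` and `gamma=64.0` are assumptions chosen for illustration, and the similarities are assumed to be cosine scores between video embeddings.

```python
import math


def _logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def _softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def circle_loss(sp, sn, margin=0.25, gamma=64.0):
    """Circle loss for one anchor (illustrative sketch, assumed hyperparameters).

    sp: within-class similarity scores (should be pushed toward 1)
    sn: between-class similarity scores (should be pushed toward 0)
    """
    opt_p, opt_n = 1.0 + margin, -margin      # optima for sp and sn
    delta_p, delta_n = 1.0 - margin, margin   # decision margins

    # Each score gets its own weight: scores far from their optimum
    # receive a larger weight, which is the "flexible" part of the loss.
    logit_p = [-gamma * max(opt_p - s, 0.0) * (s - delta_p) for s in sp]
    logit_n = [gamma * max(s - opt_n, 0.0) * (s - delta_n) for s in sn]

    # log(1 + sum_j exp(logit_n[j]) * sum_i exp(logit_p[i]))
    return _softplus(_logsumexp(logit_n) + _logsumexp(logit_p))


# Hypothetical similarity scores for two episodes.
well_separated = circle_loss(sp=[0.90, 0.95], sn=[0.10, 0.05])
poorly_separated = circle_loss(sp=[0.30], sn=[0.70])
print(well_separated < poorly_separated)
```

In this sketch the loss shrinks toward zero as within-class scores approach 1 and between-class scores approach 0, and grows when the two groups overlap, which matches the behavior the abstract describes.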

Supervised Contrastive Learning for Few-Shot Action Classification

Few-Shot Action Recognition with Hierarchical Matching and Contrastive Learning

Few-Shot Classification with Contrastive Learning

Revisiting the Spatial and Temporal Modeling for Few-shot Action Recognition

SPContrastNet: A Self-Paced Contrastive Learning Model for Few-Shot Text Classification

Cross-Modal Contrastive Learning Network for Few-Shot Action Recognition

Boosting Few-Shot Classification with View-Learnable Contrastive Learning

Supervised Contrastive Representation Embedding Based on Transformer for Few-Shot Classification

Auto-view Contrastive Learning for Few-Shot Image Recognition

Adaptive Feature Representation Based on Contrastive Learning for Few-Shot Classification

ContrastNet: A Contrastive Learning Framework for Few-Shot Text Classification

Spatio-Temporal Self-supervision for Few-Shot Action Recognition

SCaTNet: A Novel Self-supervised Contrastive Framework with Spatial-Channel Attention and Temporal Transformer for Few-Shot Action Recognition

Supervised Contrastive Few-Shot Learning for High-Frequency Time Series

Few-shot Fine-Grained Action Recognition Via Bidirectional Attention and Contrastive Meta-Learning

Few-shot action recognition with implicit temporal alignment and pair similarity optimization

Few-shot Action Recognition via Improved Attention with Self-supervision

Enhancing Few-Shot Classification without Forgetting through Multi-Level Contrastive Constraints

Few-Shot Image Classification via Contrastive Self-Supervised Learning

Few-shot Action Recognition with Prototype-centered Attentive Learning

Self-supervised and Weakly Supervised Contrastive Learning for Frame-wise Action Representations