Abstract: Gait recognition has attracted increasing attention from academia and industry as a technology for recognizing people at a distance, non-intrusively and without requiring their cooperation. Although advanced methods have achieved impressive success in lab scenarios, most of them perform poorly in the wild. Recently, some methods based on Convolutional Neural Networks (ConvNets) have been proposed to address gait recognition in the wild. However, the temporal receptive field obtained by convolution operations is limited for long gait sequences. Directly replacing the convolution blocks with visual transformer blocks, on the other hand, may fail to enhance the local temporal receptive field, which is important for covering a complete gait cycle. To address this issue, we design a Global-Local Temporal Receptive Field Network (GLGait). GLGait employs a Global-Local Temporal Module (GLTM) to establish a global-local temporal receptive field; the module mainly consists of a Pseudo Global Temporal Self-Attention (PGTA) and a temporal convolution operation. Specifically, PGTA obtains a pseudo global temporal receptive field with lower memory and computational complexity than multi-head self-attention (MHSA). The temporal convolution operation enhances the local temporal receptive field and also aggregates the pseudo global temporal receptive field into a true holistic temporal receptive field. Furthermore, we propose a Center-Augmented Triplet Loss (CTL) in GLGait to reduce the intra-class distance and expand the positive samples in the training stage. Extensive experiments show that our method obtains state-of-the-art results on the in-the-wild datasets Gait3D and GREW. The code is available at https://github.com/bgdpgz/GLGait.
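
The claim that convolution gives only a limited temporal receptive field can be made concrete with the standard receptive-field recurrence for stacked 1D convolutions. The sketch below is illustrative only. The function name and the layer configurations are hypothetical and are not taken from GLGait. It shows that the number of frames seen by one output grows only additively with depth when stride is 1, whereas a self-attention layer attends to every frame of the sequence in a single step.

```python
def temporal_receptive_field(layers):
    """Receptive field, in frames, of a stack of 1D temporal convolutions.

    `layers` is a list of (kernel_size, stride, dilation) tuples ordered
    from input to output. The layer settings used below are hypothetical.
    """
    rf = 1    # with no layers, one output frame sees exactly one input frame
    jump = 1  # distance, in input frames, between adjacent outputs so far
    for kernel_size, stride, dilation in layers:
        # Each layer widens the field by (k - 1) * dilation steps of size `jump`.
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical stack: four temporal convolutions, kernel size 3, stride 1.
stack = [(3, 1, 1)] * 4
print(temporal_receptive_field(stack))
```

Under these assumed settings each additional layer adds only two frames, so covering a long gait sequence with convolution alone would take many layers, larger kernels, or temporal striding.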

GaitCoTr: Improved Spatial-Temporal Representation for Gait Recognition with a Hybrid Convolution-Transformer Framework

Human Gait Recognition Based on Frame-by-Frame Gait Energy Images and Convolutional Long Short-Term Memory

Multi-scale Context-aware Network with Transformer for Gait Recognition

GaitGMT: Global feature mapping transformer for gait recognition

Gait Recognition Using Multichannel Convolution Neural Networks

HorGait: Advancing Gait Recognition with Efficient High-Order Spatial Interactions in LiDAR Point Clouds

GaitCTCG: cross-view gait recognition via cascaded residual temporal shift and comprehensive multi-granularity learning

Spatial Transformer Network on Skeleton‐based Gait Recognition

HorGait: A Hybrid Model for Accurate Gait Recognition in LiDAR Point Cloud Planar Projections

Exploring Self-Supervised Vision Transformers for Gait Recognition in the Wild

Gait-CNN-ViT: Multi-Model Gait Recognition with Convolutional Neural Networks and Vision Transformer

Gait recognition with global–local feature fusion based on swin transformer-3DCNN

GLGait: A Global-Local Temporal Receptive Field Network for Gait Recognition in the Wild

GaitContour: Efficient Gait Recognition based on a Contour-Pose Representation

Gait Identification by Joint Spatial-Temporal Feature

TAG: A Temporal Attentive Gait Network for Cross-View Gait Recognition

GaitTAKE: Gait Recognition by Temporal Attention and Keypoint-guided Embedding

Gaitts: indoor gait recognition with multi-scale temporal-spatial information aggregation

TriGait: Aligning and Fusing Skeleton and Silhouette Gait Data via a Tri-Branch Network

Chrono-Gait Image: A Novel Temporal Template For Gait Recognition