Abstract: Gait is closely tied to a pedestrian's identity. Compared with recognition methods based on the face, fingerprint, iris, and other biometrics, gait recognition can be performed at a distance and does not require special acquisition equipment, high image resolution, or explicit cooperation from the subject. Moreover, gait is difficult to hide or disguise. Gait recognition has a wide range of applications in public surveillance, forensic evidence collection, and daily attendance. In these practical applications, its performance is easily affected by covariates such as viewpoint variations, occlusions, and segmentation errors, among which viewpoint variation is one of the main factors. The intra-class differences across different viewpoints are often greater than the inter-class differences within the same viewpoint. Improving the robustness of cross-view gait recognition has therefore become a hot research topic. This paper critically reviews existing cross-view gait recognition methods. First, the current state of the field is introduced in terms of basic concepts, data acquisition methods, application scenarios, and development history. Then, video-based cross-view gait recognition methods are reviewed in more depth. Cross-view gait databases are analyzed in detail in terms of 1) data type, 2) sample size, 3) number of viewpoints, 4) acquisition environment, 5) other related covariates, and 6) the characteristics of each database. Cross-view gait classification methods are then presented in detail. Unlike most existing reviews, which classify gait recognition methods by basic steps such as data acquisition, feature representation, and classification, we focus on the cross-view recognition problem. Specifically, four categories of cross-view gait recognition methods are analyzed on the basis of feature representation and classification (i.e., 3D gait information construction, view transformation models (VTM), view-invariant feature extraction, and deep learning-based methods). For 3D gait information methods, gait information is extracted from multi-view gait videos and used to construct 3D gait models. These methods are robust to large view changes, but they often require complex configurations, expensive high-resolution multi-camera systems, and frame synchronization, all of which limit their application in real surveillance scenarios. For VTM methods, view transformation models derived from singular value decomposition (SVD) and from regression are introduced for both local and global features. Although a VTM may minimize the error between the transformed and the original gait features, it can neglect discriminative analysis. For view-invariant feature extraction methods, 1) handcrafted feature extraction, 2) discriminative subspace learning, and 3) metric learning are compared. Among the discriminative subspace learning methods, those based on canonical correlation analysis (CCA) are highlighted. Despite the advantages of these methods, finding a robust view-invariant subspace or metric for the features remains challenging. Deep learning-based methods for cross-view recognition mainly build on convolutional neural networks (CNN), recurrent neural networks (RNN), autoencoders (AE), generative adversarial networks (GAN), 3D convolutional neural networks (3D CNN), and graph convolutional networks (GCN). To summarize the potential of the various cross-view gait recognition methods, representative state-of-the-art methods are further compared and analyzed on the CASIA-B (CASIA gait database, dataset B), OU-ISIR LP (OU-ISIR gait database, large population dataset), and OU-MVLP (OU-ISIR gait database, multi-view large population dataset) databases.
The comparison shows that methods using 3D CNNs or combinations of multiple neural network architectures, which represent gait with a sequence of silhouettes, achieve good performance. In addition, deep neural network methods based on body model representations also perform very well when only the view varies. Finally, future research directions for cross-view gait recognition are discussed, including 1) establishing large-scale gait databases containing complex covariates, 2) cross-database gait recognition, 3) self-supervised learning of gait features, 4) disentangled representation learning of gait features, 5) further developing model-based gait representations, 6) exploring new methods for temporal feature extraction, 7) multimodal fusion for gait recognition, and 8) improving the security of gait recognition systems.
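
The abstract's observation that intra-class differences across viewpoints can exceed inter-class differences within a viewpoint can be illustrated with a toy example. The sketch below is not taken from the review: the 2-D feature vectors, the subject labels `A` and `B`, the view angles, and the helper names `distance` and `nearest_identity` are all hypothetical, chosen only so that one coordinate is dominated by viewpoint and the other by identity. Under that assumption, plain nearest-neighbour matching assigns a probe to the wrong subject when the gallery views differ.

```python
import math

# Hypothetical 2-D features keyed by (subject, view angle in degrees).
# Assumption for illustration: coordinate 0 is dominated by viewpoint,
# coordinate 1 by identity. These numbers are synthetic, not measured.
FEATURES = {
    ("A", 0): (0.0, 1.0),
    ("A", 90): (5.0, 1.2),
    ("B", 0): (0.2, 2.0),
    ("B", 90): (5.1, 2.1),
}


def distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_identity(probe_key, gallery_keys):
    """Return the subject label of the gallery sample closest to the probe."""
    probe = FEATURES[probe_key]
    best = min(gallery_keys, key=lambda k: distance(probe, FEATURES[k]))
    return best[0]


# Same subject, different views (intra-class, cross-view).
intra = distance(FEATURES[("A", 0)], FEATURES[("A", 90)])
# Different subjects, same view (inter-class, same view).
inter = distance(FEATURES[("A", 0)], FEATURES[("B", 0)])

# Subject A is enrolled only at 0 degrees, subject B only at 90 degrees.
predicted = nearest_identity(("A", 90), [("A", 0), ("B", 90)])

print(f"intra-class, cross-view distance: {intra:.3f}")
print(f"inter-class, same-view distance:  {inter:.3f}")
print(f"probe A at 90 degrees is matched to subject {predicted}")
```

In this toy setting the viewpoint coordinate dominates the distance, so the probe is matched by view rather than by identity. The method families surveyed in the abstract (view transformation, view-invariant features, deep representations) can be read as different ways of shrinking the first distance relative to the second.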
