TAG: A Temporal Attentive Gait Network for Cross-View Gait Recognition
M. Saad Shakeel, Kun Liu, Xiaochuan Liao, Wenxiong Kang
DOI: https://doi.org/10.1109/tim.2024.3497164
IF: 5.6
2024-01-01
IEEE Transactions on Instrumentation and Measurement
Abstract: Recognizing a person from a distance using gait (i.e., walking pattern) is a challenging yet interesting biometric problem. Despite recent advances in deep learning-based gait recognition (GR), learning a discriminative temporal representation of gait remains difficult because silhouettes differ only subtly in the spatial domain. To address this issue, we propose a novel attention-based GR framework, Temporal Attentive-Gait (TAG), which comprehensively refines the gait feature representation along the temporal dimension. TAG consists of three main modules: short-term temporal feature learning (ST-TFL), hybrid multi-kernel temporal attention (H-MKTA), and multi-kernel temporal self-attention (MK-TSA). First, ST-TFL captures local temporal contextual clues, facilitating the learning of short-period temporal motion patterns. Second, H-MKTA learns locally and globally distributed gait temporal information by adaptively capturing the multi-scale temporal evolution within the gait sequence. To refine the temporal attentive features learned by ST-TFL and H-MKTA, MK-TSA learns global dependencies between gait frames and recalibrates the temporal weights using a self-attention mechanism. To further enhance the discriminative power of the gait feature representation, a multi-level framework is adopted that combines gait features from different levels of the backbone. Experiments on three benchmark gait datasets (CASIA-B, Gait3D, and CCPG) demonstrate the strong potential of TAG for learning effective gait representations for recognition under complex scenarios.
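The abstract does not give the internals of the modules, so the following is only an illustrative NumPy sketch of the general idea it describes: gather temporal context at several kernel sizes, then use self-attention across frames to recalibrate per-frame weights. It is not the authors' implementation. The function names, the moving-average kernels, the kernel sizes (3, 5), and the sigmoid gate are all assumptions made for this example.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def multi_kernel_temporal_context(x, kernel_sizes=(3, 5)):
    """Average each frame with its neighbours at several temporal scales.

    x has shape (T, C): T frames, C feature channels.
    Kernel sizes are assumed to be odd (hypothetical choice for this sketch).
    """
    T, _ = x.shape
    scales = []
    for k in kernel_sizes:
        pad = k // 2
        # Repeat the edge frames so the output keeps length T.
        padded = np.pad(x, ((pad, pad), (0, 0)), mode="edge")
        pooled = np.stack([padded[t:t + k].mean(axis=0) for t in range(T)])
        scales.append(pooled)
    # Fuse the scales with a plain average (an assumption, not from the paper).
    return np.mean(scales, axis=0)


def temporal_attention_matrix(x):
    """Scaled dot-product attention between frames; returns a (T, T) matrix."""
    _, C = x.shape
    scores = x @ x.T / np.sqrt(C)
    return softmax(scores, axis=-1)


def recalibrate(x, kernel_sizes=(3, 5)):
    """Re-weight frame features using multi-kernel context and self-attention."""
    context = multi_kernel_temporal_context(x, kernel_sizes)
    attn = temporal_attention_matrix(context)      # global frame dependencies
    mixed = attn @ context                         # (T, C) attended features
    gate = 1.0 / (1.0 + np.exp(-mixed))            # sigmoid weights in (0, 1)
    return x * gate


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.standard_normal((8, 4))           # 8 frames, 4 channels
    print(recalibrate(frames).shape)
```

In this sketch the gate lies strictly between 0 and 1, so recalibration can only attenuate a frame's features; a learned model would typically add trainable projections for queries, keys, and values.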