Harmonizing space–time dynamics for precision in human action recognition

Abdul Majid, Yulin Wang, Anwar Ullah, Fahim Naiz, Muhammad Zarar
DOI: https://doi.org/10.1007/s11042-024-20407-4
IF: 2.577
2024-11-20
Multimedia Tools and Applications
Abstract: Human action recognition (HAR) plays a vital role in various fields, including surveillance, healthcare, and human–computer interaction. Recognizing multi-actor actions in crowded scenarios poses a significant challenge due to the complex dynamics and interactions among individuals. Existing methods often fail to effectively capture both spatial and temporal dependencies in video frames, leading to inaccuracies in recognizing diverse action classes under real-world conditions. In this paper, we introduce a novel framework called Video Capsule Net-Multi Actor (VCN-MA), designed to enhance the precision of HAR by harmonizing space–time dynamics. The VCN-MA architecture incorporates a deep neural network combining convolutional and recurrent layers to capture spatial and temporal features. Additionally, we leverage pre-trained embeddings and attention mechanisms to focus on critical features during action sequences. The proposed model utilizes capsule networks with dynamic routing to represent complex hierarchical spatial–temporal relationships, further refining action recognition accuracy. We evaluate our approach using the MAMA dataset, which includes 35 diverse action classes under varied real-world conditions such as different viewpoints, occlusions, and lighting variations. The VCN-MA model achieves an overall accuracy of 85% on the test set, outperforming existing state-of-the-art methods. Visualizations, including confusion matrices and ROC curves, along with metrics like the Matthews Correlation Coefficient (MCC), provide a comprehensive analysis of the model's performance across different action classes. These results demonstrate the potential applicability of VCN-MA in practical scenarios, including video surveillance, human–computer interaction, and healthcare applications.
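The abstract reports the Matthews Correlation Coefficient alongside confusion matrices for a multi-class problem. As a rough illustration of how the two relate, the sketch below computes the standard multi-class generalization of MCC directly from a confusion matrix. The function name `multiclass_mcc` and the 3-class toy matrix are assumptions made for this example; they are not taken from the paper, and the numbers are not the paper's results.

```python
import math


def multiclass_mcc(cm):
    """Multi-class MCC from a square confusion matrix.

    cm[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)            # all samples
    correct = sum(cm[k][k] for k in range(n))      # diagonal (correct predictions)
    true_counts = [sum(cm[k]) for k in range(n)]   # row sums: samples per true class
    pred_counts = [sum(cm[i][k] for i in range(n)) for k in range(n)]  # column sums

    numerator = correct * total - sum(
        p * t for p, t in zip(pred_counts, true_counts)
    )
    denominator = math.sqrt(
        (total * total - sum(p * p for p in pred_counts))
        * (total * total - sum(t * t for t in true_counts))
    )
    # By convention, return 0.0 when the denominator vanishes
    # (e.g. every sample is predicted as the same class).
    return numerator / denominator if denominator else 0.0


# Hypothetical 3-class confusion matrix, for illustration only.
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
print(round(multiclass_mcc(toy_cm), 3))
```

Unlike overall accuracy, this score uses the row and column totals of the matrix, so it stays informative when classes are imbalanced: a perfect classifier scores 1, and a classifier that always predicts one class scores 0.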
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering