Video Abnormal Behavior Recognition and Trajectory Prediction Based on Lightweight Skeleton Feature Extraction

Ling Wang, Cong Ding, Yifan Zhang, Tie Hua Zhou, Wei Ding, Keun Ho Ryu, Kwang Woo Nam
DOI: https://doi.org/10.3390/s24123711
IF: 3.9
2024-06-08
Sensors
Abstract: Video action recognition based on skeleton nodes is a prominent topic in computer vision. In real application scenarios, the large number of skeleton nodes and occlusion between individuals seriously degrade recognition speed and accuracy. We therefore propose a lightweight multi-stream feature cross-fusion (L-MSFCF) model to recognize abnormal behaviors such as fighting, vicious kicking, and climbing over walls. The model improves recognition speed by computing on a lightweight set of skeleton nodes, and improves recognition accuracy by predicting occluded skeleton nodes, which addresses the behavior occlusion problem. Experiments show that our All-MSFCF model achieves an average recognition accuracy of 92.7% across eight kinds of abnormal behavior. Although the lightweight L-MSFCF model has a lower average accuracy of 87.3%, its average recognition speed is 62.7% higher than that of the full-skeleton recognition model, which makes it more suitable for real-time tracking. Moreover, our Trajectory Prediction Tracking (TPT) model can predict moving positions in real time from dynamically selected core skeleton nodes, especially for short-term prediction within 15 frames and 30 frames, which have lower average loss errors.
Categories: Engineering, Electrical & Electronic; Instruments & Instrumentation; Chemistry, Analytical
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

This paper primarily aims to address the issues of abnormal behavior recognition and trajectory prediction in videos. Specifically:

1. **Improving Recognition Speed and Accuracy**:
   - Due to the large number of skeleton nodes in practical application scenarios and the severe impact of behavior occlusion between individuals on recognition speed and accuracy, a lightweight multi-stream feature cross-fusion (L-MSFCF) model is proposed. By calculating lightweight skeleton nodes, the recognition speed is significantly improved, and the recognition accuracy is enhanced through occluded skeleton node prediction analysis.
2. **Solving Occlusion Problems**:
   - To address the issue of occluded skeleton node information, the skeleton node data from past frames is used to predict the occluded skeleton node information before feature extraction, thereby improving recognition accuracy. When tracking abnormal targets, to cope with situations where the target is occluded or disappears, a trajectory prediction tracking (TPT) model is proposed.
3. **Real-time Trajectory Prediction**:
   - The proposed TPT model can calculate real-time predicted moving positions based on dynamically selected core skeleton nodes, particularly achieving lower average loss errors in short-term (15-frame and 30-frame) predictions.

In summary, this paper aims to solve the issues of speed, accuracy, and occlusion in abnormal behavior recognition in video surveillance by proposing new models and methods, and to achieve real-time trajectory prediction.
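
To make the idea of predicting an occluded skeleton node from past frames, and of extrapolating a core node's position over a short horizon, more concrete, here is a minimal Python sketch. It is not the paper's method: this summary does not describe the L-MSFCF or TPT models in detail, so the sketch substitutes plain constant-velocity extrapolation. The function names `fill_occluded_node` and `predict_trajectory` and the 2D `(x, y)` point layout are assumptions made only for illustration.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fill_occluded_node(history: List[Optional[Point]]) -> Optional[Point]:
    """Estimate a node's position in the frame that follows `history`.

    `history` holds one entry per past frame; None marks a frame where the
    node was occluded. Assumption: constant velocity between the last two
    visible observations (a stand-in, not the paper's predictor).
    """
    visible = [(i, p) for i, p in enumerate(history) if p is not None]
    if not visible:
        return None  # nothing to extrapolate from
    if len(visible) == 1:
        return visible[0][1]  # no velocity estimate: reuse the last seen position
    (i0, (x0, y0)), (i1, (x1, y1)) = visible[-2], visible[-1]
    dt = i1 - i0
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    steps = len(history) - i1  # frames elapsed since the last visible one
    return (x1 + vx * steps, y1 + vy * steps)


def predict_trajectory(track: List[Point], horizon: int) -> List[Point]:
    """Extrapolate a core node's track `horizon` frames ahead.

    Uses the average per-frame velocity over the whole track (again an
    illustrative assumption rather than the TPT model).
    """
    if not track:
        raise ValueError("track must contain at least one point")
    if len(track) == 1:
        return [track[-1]] * horizon
    n = len(track) - 1
    vx = (track[-1][0] - track[0][0]) / n
    vy = (track[-1][1] - track[0][1]) / n
    x, y = track[-1]
    return [(x + vx * k, y + vy * k) for k in range(1, horizon + 1)]


if __name__ == "__main__":
    # The node is seen in frames 0 and 1, occluded in frame 2; estimate frame 3.
    print(fill_occluded_node([(0.0, 0.0), (1.0, 2.0), None]))
    # Predict 3 frames ahead for a node moving steadily to the right and down.
    print(predict_trajectory([(0.0, 0.0), (2.0, 1.0), (4.0, 2.0)], 3))
```

A simple extrapolation like this tends to drift as the horizon grows, which is consistent with the summary's point that short-term predictions (15 and 30 frames) show lower average loss errors than longer ones.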