Abstract: Human activity recognition (HAR) is an emerging field that identifies human actions in different settings. Activities are recognized by sensors and cameras placed in the room or residence where human action is to be observed. Real-world applications and automation systems employ activity recognition to detect anomalous behavior. For example, anomalous behavior such as a patient walking when advised to rest in bed, or an elderly person falling, needs to be monitored carefully in hospitals as well as in home-based monitoring systems. HAR is also used in security, healthcare, human interaction, and computer vision. There is no general, explicit approach for inferring human activities from sensor data, and both the sensor data and the heuristics applied to it present technological challenges. Several factors must be evaluated to build a reliable activity recognition system, including storage, connectivity, processing, energy efficiency, and system adaptability. Deep learning systems can better recognize human activities from earlier datasets. In this study, a hybrid One-Dimensional Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) classifier is employed to improve the performance of HAR. It offers a method for automatically and data-adaptively extracting reliable features from raw data. The model proposes a two-way classification for abstract and individual activity monitoring. Human activities such as walking, sitting, walking downstairs, walking upstairs, laying, and standing, along with mobile phone usage, are considered in this study. The proposed model is also compared with state-of-the-art algorithms such as Support Vector Machine (SVM), K-Nearest Neighbour (KNN), LSTM, and CNN. The UCI-HAR dataset is used for recognizing human activity in the proposed work. Features such as the mean, median, and autoregressive coefficients are derived from the raw data and processed with principal component analysis (PCA) to make them more reliable. The LSTM model accepts a series of activities, whereas the CNN accepts a single input. The CNN takes each single input, and each of its outputs is forwarded to the LSTM model, which classifies the activity. The hybrid model achieves 97.89% accuracy with the new feature selection methods, whereas the CNN and LSTM individually achieve 92.77% and 92.80% accuracy, respectively.
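The feature step described in the abstract (mean, median, and autoregressive coefficients per signal, followed by PCA) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `ar_coefficients`, `window_features`, and `pca_project` are hypothetical, the AR order of 4 and the least-squares fitting method are assumptions (the abstract does not state how the coefficients are estimated), and random synthetic windows stand in for the UCI-HAR signals.

```python
import numpy as np


def ar_coefficients(x, order=4):
    """Least-squares fit of x[t] ~ sum_k a[k] * x[t-1-k] (assumed method)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    # Column k holds the samples lagged by k+1 steps relative to the target.
    lagged = np.column_stack(
        [x[order - k - 1 : len(x) - k - 1] for k in range(order)]
    )
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return coeffs


def window_features(window, order=4):
    """Mean, median, and AR coefficients for each channel of one window.

    `window` has shape (timesteps, channels).
    """
    window = np.asarray(window, dtype=float)
    feats = []
    for channel in window.T:
        feats.append(np.mean(channel))
        feats.append(np.median(channel))
        feats.extend(ar_coefficients(channel, order))
    return np.array(feats)


def pca_project(features, n_components):
    """Project feature rows onto the top principal components via SVD."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


# Synthetic stand-in data: 20 windows, 128 timesteps, 3 channels (assumed sizes).
rng = np.random.default_rng(0)
windows = rng.normal(size=(20, 128, 3))

features = np.stack([window_features(w) for w in windows])
reduced = pca_project(features, n_components=5)
print(features.shape, reduced.shape)
```

With three channels and an AR order of 4, each window yields 3 × (2 + 4) = 18 features, which PCA then reduces to 5 components in this example. In the pipeline the abstract describes, the reduced features would be passed to the CNN and its outputs forwarded to the LSTM; that stage is not sketched here because the abstract gives no layer details.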
