Exploration of deep learning architectures for real-time yoga pose recognition
Sumeet Saurav, Prashant Gidde, Sanjay Singh
DOI: https://doi.org/10.1007/s11042-024-18694-y
IF: 2.577
2024-03-09
Multimedia Tools and Applications
Abstract: Yoga has become an essential part of modern life, and hence there is tremendous demand for self-training yoga platforms that support trainer-less practice. Robust and efficient recognition of yoga poses in a video stream is the first requirement of such systems. However, the existing techniques for yoga pose recognition are compute-intensive and may fail in complex real-world conditions, which seriously limits their practical applicability. To this end, this paper examines state-of-the-art deep learning techniques to implement a robust and compute-efficient system for real-time yoga pose recognition on a resource-constrained embedded platform. The first technique uses a hybrid CNN & LSTM model, while the other three (3DCNN Model1, 3DCNN Model2, and 3DCNN Model3) employ the Sports1M pre-trained 3DCNN model named C3D. We assessed the performance of the designed architectures on a publicly available yoga pose database using four well-known metrics, namely recognition accuracy, precision, recall, and F1-score. On three database splits, the designed hybrid CNN & LSTM, 3DCNN Model1, 3DCNN Model2, and 3DCNN Model3 achieved mean recognition accuracies of 98.80%, 99.07%, 98.19%, and 98.43%, respectively. On one of the splits, the best-performing model achieved the highest recognition accuracy of 99.65%, surpassing the baseline accuracy of 99.38%. Moreover, the optimal model runs at a frame rate of 31 FPS on an Nvidia GPU-enabled desktop, much better than the previous best of 3 FPS. Finally, to evaluate the model's efficiency on embedded systems, we optimized it using the TensorRT SDK and deployed it on an Nvidia Xavier embedded platform. The optimized model runs at 8 FPS on the resource-constrained embedded platform, demonstrating its suitability for real-world applications.
A working demo of the developed system is available at https://youtu.be/at1GJ8Nxx38, and the source code is available at https://github.com/sumeetssaurav/Yoga-Pose-Classification.
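The four evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the authors' repository: the function name classification_metrics, the pose labels, and the choice of macro averaging over classes are assumptions made here for illustration, since the abstract does not state how the per-class scores are aggregated.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1-score.

    Macro averaging (an unweighted mean over classes) is an assumption of
    this sketch; the abstract does not specify the averaging scheme.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class receives a false positive
            fn[truth] += 1   # true class receives a false negative

    def safe_div(num, den):
        # Avoid division by zero for classes that never occur.
        return num / den if den else 0.0

    precisions = [safe_div(tp[c], tp[c] + fp[c]) for c in labels]
    recalls = [safe_div(tp[c], tp[c] + fn[c]) for c in labels]
    f1_scores = [safe_div(2 * p * r, p + r) for p, r in zip(precisions, recalls)]

    n_classes = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": safe_div(sum(precisions), n_classes),
        "recall": safe_div(sum(recalls), n_classes),
        "f1": safe_div(sum(f1_scores), n_classes),
    }


# Hypothetical per-clip labels, used only to exercise the function.
true_poses = ["tadasana", "tadasana", "padmasana", "padmasana", "shavasana", "shavasana"]
pred_poses = ["tadasana", "tadasana", "padmasana", "shavasana", "shavasana", "shavasana"]
print(classification_metrics(true_poses, pred_poses))
```

Under this reading, a mean recognition accuracy over three database splits would be the arithmetic mean of the "accuracy" value computed on each split's test set.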
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering