A comparative analysis on major key-frame extraction techniques

Jhuma Sunuwar, Samarjeet Borah
DOI: https://doi.org/10.1007/s11042-024-18380-z
IF: 2.577
2024-02-14
Multimedia Tools and Applications
Abstract: Real-time hand gesture recognition involves analyzing videos of static and dynamic gestures. A video is a sequential arrangement of images, captured and eventually displayed at a given frequency. Not all video frames are useful, and including every frame makes video processing needlessly complex. Methods have therefore been devised to remove redundant and identical frames and so simplify video processing. One such approach is key-frame extraction, which identifies and retains only those frames that accurately represent the original content of the video. In this paper, we empirically analyze different methods for key-frame extraction. We experimentally compare five key-frame extraction methods based on Simple Frame Extraction, Uniform Sampling, Structural Similarity Index, Absolute Two Frame Difference, Motion Detection, and an error-correction-based key-frame extraction technique using Visual Geometry Group-16 (VGG16). Three publicly available datasets (DVS gesture, American Sign Language (ASL) gesture, and IPN gesture) and two self-constructed datasets (NSL_Consonent and NSL_Vowel) are used to evaluate the methods. NSL_Consonent and NSL_Vowel comprise 37 consonants and 17 vowels of the Nepali Sign Language. The experimental results show that uniform sampling is suitable only for static gestures that do not require any other structural information for selecting key-frames. The Structural Similarity Index, KCKFE based on VGG16, and motion detection-based key-frame extraction are found suitable for dynamic gestures. The two-frame absolute difference method yields poor key-frames because it generates as many frames as are present in the video.
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
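
To make two of the techniques named in the abstract concrete, the following is a minimal sketch of uniform sampling and of a thresholded absolute frame difference. It is not the authors' implementation: the function names, the flattened grayscale frame representation, the `threshold` parameter, and the choice to compare each frame against the last retained key-frame are all assumptions made for illustration.

```python
from typing import List, Sequence

# Assumption: a frame is a flattened grayscale image with pixel values 0-255.
Frame = Sequence[int]


def uniform_sample(num_frames: int, step: int) -> List[int]:
    """Return the indices of every `step`-th frame, starting at frame 0."""
    if step < 1:
        raise ValueError("step must be >= 1")
    return list(range(0, num_frames, step))


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("frames must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def diff_key_frames(frames: Sequence[Frame], threshold: float) -> List[int]:
    """Keep frame 0, then every frame whose mean absolute difference from
    the most recently kept key-frame exceeds `threshold` (hypothetical
    parameter, not taken from the paper)."""
    if not frames:
        return []
    keys = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[keys[-1]], frames[i]) > threshold:
            keys.append(i)
    return keys


if __name__ == "__main__":
    # Five tiny 4-pixel "frames": two near-duplicates, a change, a
    # near-duplicate of that change, then another change.
    video = [
        [0, 0, 0, 0],
        [0, 0, 0, 1],
        [10, 10, 10, 10],
        [10, 10, 10, 11],
        [50, 50, 50, 50],
    ]
    print(uniform_sample(len(video), 2))
    print(diff_key_frames(video, threshold=5.0))
```

Uniform sampling ignores frame content entirely, which is consistent with the abstract's observation that it suits static gestures only. In the difference-based sketch, a threshold set too low retains nearly every frame, which may be one way an absolute-difference method ends up producing as many frames as the input video contains.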