Abstract: Human action recognition from videos requires analyzing a sequence of frames, which demands substantial processing power and time. Moreover, subsequent frames in a video contain redundancy that does not contribute to extracting distinguishable features but still consumes processing resources. To recognize actions with lower processing overhead, researchers are focusing on key frames and have proposed different techniques based on inter-frame clustering, fixed frame intervals, and inter-frame differences. Clustering and fixed-interval approaches rely on a predefined number of frames or fixed intervals and are therefore not equally effective for faster and slower actions, which differ in their frequency of change. Furthermore, all existing approaches consider pixel intensities only and neglect another equally important aspect of motion, i.e., direction. Three-channel optical flows are a motion representation that depicts both the magnitude and the direction of movement between pairs of frames in the form of colors. Here we propose a novel "KFF-algorithm" that processes a sequence of three-channel optical flows to extract key flow frames for action recognition. Being dynamic in terms of frame interval, it efficiently extracts key frames for both slower and faster actions by analyzing changes in the direction, magnitude, and coverage of motion in subsequent frames. Furthermore, in comparison with other approaches, the KFF-algorithm covers relatively extended motion patterns with the fewest frames, retaining only frames with significant change. Also, for the majority of classes, the KFF-algorithm achieved substantial per-class accuracy when evaluated with the 3D-ConvNet model on the UCF-101 benchmark dataset for human action recognition.
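
The idea of selecting frames at a dynamic interval from changes in direction, magnitude, and coverage of motion can be illustrated with a minimal sketch. This is not the KFF-algorithm itself, whose details the abstract does not give. It assumes each flow frame is already available as a flat list of per-pixel `(dx, dy)` displacement vectors instead of a color-coded three-channel image, and the function names (`summarize_flow`, `select_key_flows`) and all thresholds (`moving_eps`, `mag_tol`, `dir_tol`, `cov_tol`) are hypothetical choices made for illustration.

```python
import math


def summarize_flow(flow, moving_eps=0.5):
    """Reduce one flow frame to (mean magnitude, dominant direction, coverage).

    `flow` is a flat list of (dx, dy) vectors, one per pixel (an assumption
    of this sketch). Only pixels moving faster than `moving_eps` are counted.
    """
    moving = [(dx, dy) for dx, dy in flow if math.hypot(dx, dy) > moving_eps]
    if not moving:
        return 0.0, 0.0, 0.0
    n = len(moving)
    magnitude = sum(math.hypot(dx, dy) for dx, dy in moving) / n
    # Dominant direction: angle of the mean motion vector, in radians.
    direction = math.atan2(sum(dy for _, dy in moving) / n,
                           sum(dx for dx, _ in moving) / n)
    coverage = n / len(flow)  # fraction of pixels that are moving
    return magnitude, direction, coverage


def angle_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def select_key_flows(flows, mag_tol=0.5, dir_tol=math.pi / 4, cov_tol=0.2):
    """Return indices of flow frames that differ notably from the last kept one.

    The interval between kept frames is not fixed: a frame is kept only when
    magnitude, direction, or coverage changes beyond its (hypothetical) tolerance.
    """
    if not flows:
        return []
    keys = [0]
    ref = summarize_flow(flows[0])
    for i in range(1, len(flows)):
        cur = summarize_flow(flows[i])
        both_moving = ref[0] > 0 and cur[0] > 0
        changed = (
            abs(cur[0] - ref[0]) > mag_tol
            or (both_moving and angle_diff(cur[1], ref[1]) > dir_tol)
            or abs(cur[2] - ref[2]) > cov_tol
        )
        if changed:
            keys.append(i)
            ref = cur  # later frames are compared against the new key frame
    return keys
```

Because each frame is compared against the most recently kept key frame and not against its immediate predecessor, a slow action that drifts gradually still produces a key frame once the accumulated change passes a tolerance, while a fast action yields key frames at shorter intervals.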
