Encoding Learning Network Combined with Feature Similarity Constraints for Human Action Recognition

Chao Wu, Yakun Gao, Guang Li, Chunfeng Shi
DOI: https://doi.org/10.1007/s11042-023-17424-0
IF: 2.577
2023-01-01
Multimedia Tools and Applications
Abstract: Extreme learning machine (ELM) is a fast and efficient classifier. However, ELM-based networks cannot process descriptor-level features extracted from video sequences, so they cannot be used directly to recognize human actions. An encoding learning network (ELN) is proposed to solve this problem. The network is composed of a feature encoding module and a double similarity-constrained extreme learning machine (DS-ELM). In the feature encoding module, a sparse mapping weight matrix is combined with pyramid pooling to generate representation-level features, which DS-ELM then classifies. To exploit the similarity information between the features of each layer, the different weight matrices in ELN are trained separately to improve recognition ability. For training the sparse mapping weight matrix, an auto-encoded dictionary and a similarity constrained linear coding (SCLC) method are proposed to encode the desired output; the matrix is then trained using partial descriptor features and the corresponding desired outputs. For training the classification weights, the ELM objective function is updated with the similarity relationship between hidden layer features to derive the training formula of DS-ELM, which improves classification performance while avoiding iterative training. To verify the feasibility of ELN, experiments are conducted on the Olympic Sports, UCF11, Hollywood2, UCF101, and Self-collection databases. Experimental results show that the proposed ELN can process descriptor features directly, and that it further exploits the similarity information between the features of each layer to obtain excellent recognition performance compared with other improved ELM-based methods.
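For readers unfamiliar with why ELM training avoids iteration, the following is a minimal sketch of a standard ridge-regularized ELM in NumPy. It is not the paper's DS-ELM: the similarity constraint, the feature encoding module, and SCLC are not reproduced here. The function names, the sigmoid activation, the hidden-layer size, the regularization constant `C`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def train_elm(X, T, n_hidden=64, C=1.0, seed=0):
    """Fit a basic ELM: random hidden layer, closed-form output weights."""
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn once and never updated.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # sigmoid hidden features
    # Output weights solve a ridge-regularized least-squares problem,
    # beta = (H^T H + I/C)^{-1} H^T T, so no iterative training is needed.
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ T)
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Return the index of the highest-scoring class for each row of X."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.argmax(H @ beta, axis=1)

# Toy data (hypothetical): two well-separated Gaussian blobs in 2-D.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 0.5, (50, 2)),
               rng.normal(2.0, 0.5, (50, 2))])
y = np.repeat([0, 1], 50)
T = np.eye(2)[y]  # one-hot targets

W, b, beta = train_elm(X, T)
accuracy = np.mean(predict_elm(X, W, b, beta) == y)
print(f"training accuracy: {accuracy:.2f}")
```

According to the abstract, DS-ELM modifies this kind of objective with a similarity relationship between hidden layer features while keeping a closed-form solution; the exact form of that term is not given in the abstract and is not guessed at here.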