A2SN: attention based two stream network for sports video classification

Ray, Abhisek; Aslam, Nazia
DOI: https://doi.org/10.1007/s11042-024-18375-w
IF: 2.577
2024-02-09
Multimedia Tools and Applications
Abstract: In this digital age, 3D data interpretation has emerged as a significant research area, with video being the most extensively used electronic medium for data transfer. Appropriate classification of video data is critical for storage and broadcasting. In this paper, we introduce a novel attention-based two-stream deep neural network (A2SN) for sports video classification. The first stream of A2SN is a transfer learning-based model that transfers pre-trained weights from the base model to the head model for spatial feature learning. The second stream is an attention-based feature extractor module that learns the spatiotemporal features of the video data. The features obtained from the transfer learning-based and attention-based streams are then concatenated and passed through a dense layer for classification. Extensive experiments on the UCF50 and SVW datasets yield accuracies of 99.26% and 97.3%, respectively, which validates the efficacy of the proposed model. To check the model's generalizability, we also evaluated A2SN on the UCF101 dataset, where it achieved an accuracy of 97.1%.
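The two-stream fusion described in the abstract (one spatial stream, one attention-based stream, concatenation, then a dense classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the backbone, the form of attention, or any layer sizes, so everything below is an assumption made for illustration. In particular, `spatial_stream` stands in for a pre-trained backbone by simply averaging per-frame features, `attention_stream` uses a simple learned-query attention pooling over time, and the names `w_q`, `w_dense`, `b_dense` and the dimensions `T`, `D`, `C` are hypothetical. Weights are random, so the output is only shape-correct, not meaningful.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_stream(frame_feats):
    """Stand-in for the transfer-learning stream (assumption).

    frame_feats: (T, D) per-frame features, imagined to come from a
    pre-trained image backbone. Here they are just averaged over time.
    """
    return frame_feats.mean(axis=0)  # (D,)


def attention_stream(frame_feats, w_q):
    """Stand-in for the attention-based stream (assumption).

    Scores each frame against a query vector w_q, normalizes the scores
    with softmax, and returns the attention-weighted sum of the frames.
    """
    d = frame_feats.shape[1]
    scores = frame_feats @ w_q / np.sqrt(d)  # (T,)
    weights = softmax(scores)                # (T,), sums to 1
    return weights @ frame_feats, weights    # (D,), (T,)


def classify(frame_feats, w_q, w_dense, b_dense):
    """Concatenate both stream outputs and apply one dense softmax layer."""
    s = spatial_stream(frame_feats)
    a, weights = attention_stream(frame_feats, w_q)
    fused = np.concatenate([s, a])           # (2 * D,)
    probs = softmax(w_dense @ fused + b_dense)
    return probs, weights


# Hypothetical sizes: 16 frames, 32-dim features, 50 classes (as in UCF50).
T, D, C = 16, 32, 50
rng = np.random.default_rng(0)
frame_feats = rng.normal(size=(T, D))
w_q = rng.normal(size=D)
w_dense = rng.normal(size=(C, 2 * D)) * 0.1
b_dense = np.zeros(C)

probs, weights = classify(frame_feats, w_q, w_dense, b_dense)
print(probs.shape, weights.shape)
```

Concatenating the two feature vectors before the dense layer is a late-fusion design: each stream can be built or pre-trained independently, and only the final classifier sees both representations.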
computer science, information systems, theory & methods, engineering, electrical & electronic, software engineering
What problem does this paper attempt to address?