Gun identification from gunshot audios for secure public places using transformer learning

Rahul Nijhawan, Sharik Ali Ansari, Sunil Kumar, Fawaz Alassery, Sayed M El-Kenawy
DOI: https://doi.org/10.1038/s41598-022-17497-1
2022-08-02
Abstract: The rise in mass shootings and terrorist activities severely impacts society, both mentally and physically. Real-time, cost-effective automated weapon detection systems can increase the sense of safety in public places. Most previously proposed methods are vision-based: they analyze camera frames for the presence of a gun. This research instead focuses on detecting the gun type (rifle, handgun, or none) from the audio of its shot, using mel-frequency-based audio features. We compared convolution-based and fully self-attention-based (transformer) architectures and found that the transformer architecture generalizes better on audio features. Experimental results using the proposed transformer methodology on audio clips of gunshots show a classification accuracy of 93.87%, with training and validation losses of 0.2509 and 0.1991, respectively. Based on these experiments, we are convinced that our model can be used effectively both as a standalone system and in association with visual gun-detection systems for better security.
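To make the notion of mel-frequency-based audio features concrete, the following is a minimal sketch of a log-mel spectrogram computed with NumPy alone. It is not the authors' pipeline: the abstract does not state the feature parameters, so the sample rate, FFT size, hop length, and number of mel bands below are assumptions, as are the helper names `mel_filterbank` and `log_mel_spectrogram`. The mel scale uses the common 2595 * log10(1 + f / 700) formula.

```python
import numpy as np

def hz_to_mel(f):
    # Common (HTK-style) Hz-to-mel conversion.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters whose centers are equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr=22050, n_fft=1024, hop=512, n_mels=40):
    # All default parameter values are illustrative assumptions.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2    # (n_frames, n_fft//2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T   # (n_frames, n_mels)
    return 10.0 * np.log10(mel + 1e-10)                 # decibel scale

# Usage on a synthetic 1-second tone (a stand-in for a real gunshot clip).
sr = 22050
t = np.arange(sr) / sr
features = log_mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr=sr)
print(features.shape)  # (42, 40): 42 time frames, 40 mel bands
```

The resulting frames-by-bands matrix is the kind of two-dimensional input that either a convolutional network or a transformer could consume, for example by treating each time frame (or a patch of the spectrogram) as one token.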