VGGish transfer learning model for the efficient detection of payload weight of drones using Mel-spectrogram analysis

Eman I. Abd El-Latif,Noha Emad El-Sayad,Kamel K. Mohammed,Ashraf Darwish,Aboul Ella Hassanien
DOI: https://doi.org/10.1007/s00521-024-09661-7
2024-04-23
Neural Computing and Applications
Abstract:Abstract This paper presents an accurate model for predicting different payload weights from 3DR SOLO drone acoustic emission. The dataset consists of eleven different payload weights, ranging from 0 to 500 g with a 50 g increment. Initially, the dataset's drone sounds are broken up into 34 frames, each frame was about 5 s. Then, Mel-spectrogram and VGGish model are employed for feature extraction from these sound signals. CNN network is utilized for classification, and during the training phase, the network's weights are iteratively updated using the Adam optimization algorithm. Finally, two experiments are performed to evaluate the model. The first experiment is performed utilizing the original data (before augmentation), while the second used the augmented data. Different payload weights are identified with a potential accuracy of 99.98%, sensitivity of 99.98%, and specificity of 100% based on experimental results. Moreover, a comprehensive comparison with prior works that utilized the same dataset validates the superiority of the proposed model.
computer science, artificial intelligence
What problem does this paper attempt to address?