Abstract: This paper presents a new approach to environmental sound classification that combines a texture feature, the local binary pattern (LBP), with audio features. To our knowledge, this is the first time that the LBP (or its variants), which has a proven track record in image recognition and classification, has been generalized to 1D and combined with audio features for an environmental sound classification task. To this end, we define LBP-1D and local phase quantization (LPQ)-1D on the one-dimensional (1D) audio signal, and we apply the original LBP, the variance LBP (VARLBP) and the extended LBP (ELBP) to the spectrogram of the audio signal in order to model the sound texture. We also extensively compare these new LBP-based features with the classical audio descriptors commonly used in environmental sound classification, such as MFCC, GFCC, CQT, chromagram, STE and ZCR. We evaluate our algorithm on the ESC-10 and ESC-50 datasets using classical machine learning algorithms, such as support vector machines (SVM), random forest and k-nearest neighbor (kNN). The results show that the LBP features outperform the classical audio features.
We then combine the LBP features with the audio descriptors, and our best combined model achieves state-of-the-art results for environmental sound classification: 88.5% on ESC-10 and 64.6% on ESC-50. These results outperform those of methods that use handcrafted features with classical machine learning algorithms and are similar to those of some convolutional neural network-based methods. Although our method is not at the cutting edge of the state of the art, it is faster than any convolutional neural network method and is a better choice when data are scarce or computing power is limited.
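The abstract does not spell out how LBP-1D is defined, so the following is only a minimal sketch of one common way to carry the LBP idea over to a 1D signal: each sample is compared with its neighbors on both sides, the comparison results are packed into a binary code, and the histogram of codes serves as the feature vector. The function names, the `radius` parameter, the `>=` thresholding rule and the bit ordering are all assumptions made for illustration and may differ from the paper's definition.

```python
import numpy as np


def lbp_1d(signal, radius=4):
    """Return one LBP-1D code per interior sample of a 1D signal.

    Assumed definition (not taken from the paper): each of the
    2 * radius neighbors contributes one bit, set to 1 when the
    neighbor is greater than or equal to the center sample.
    """
    x = np.asarray(signal, dtype=float)
    n = x.size
    if n < 2 * radius + 1:
        raise ValueError("signal is too short for the chosen radius")

    # Samples that have a full neighborhood on both sides.
    centers = x[radius:n - radius]
    codes = np.zeros(centers.size, dtype=np.int64)

    bit = 0
    for offset in range(-radius, radius + 1):
        if offset == 0:
            continue  # skip the center sample itself
        neighbors = x[radius + offset:n - radius + offset]
        codes |= (neighbors >= centers).astype(np.int64) << bit
        bit += 1
    return codes


def lbp_1d_histogram(signal, radius=4):
    """Normalized histogram of LBP-1D codes, usable as a feature vector."""
    codes = lbp_1d(signal, radius)
    n_bins = 2 ** (2 * radius)
    hist = np.bincount(codes, minlength=n_bins).astype(float)
    return hist / hist.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(22050)  # stand-in for one second of audio
    features = lbp_1d_histogram(audio, radius=4)
    print(features.shape)
```

With `radius=4` the histogram has 256 bins. Under the same assumptions, such a vector could be concatenated with descriptors like MFCC statistics before being passed to a classical classifier such as an SVM, which is the kind of feature collaboration the abstract describes.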

Enhanced Class-Dependent Classification of Audio Signals

Robust Audio Sensing with Multi-Sound Classification

Multi-level Attention Model with Deep Scattering Spectrum for Acoustic Scene Classification

Hierarchical classification for acoustic scenes using deep learning

A study of audio classification on using different feature schemes with three classifiers

A Compact and Discriminative Feature Based on Auditory Summary Statistics for Acoustic Scene Classification

Semi-supervised Feature Selection for Audio Classification Based on Constraint Compensated Laplacian Score

Audio Enhancement and Intelligent Classification of Household Sound Events Using a Sparsely Deployed Array

Multi-level distance embedding learning for robust acoustic scene classification with unseen devices

Adaptive DCTNet for Audio Signal Classification

Low-Complexity Acoustic Scene Classification Using Data Generation Based on Primary Ambient Extraction

Audio Scanning Network: Bridging Time and Frequency Domains for Audio Classification

Ecological Environmental Sounds Classification Based on Genetic Algorithm and Matching Pursuit Sparse Decomposition

Acoustic bird species classification under low SNR and small-scale dataset conditions

Hierarchical Support Vector Machines for Audio Classification

MFCC combined with sparse coding for sound event classification under different noise environments

An Automatic Approach Towards Audio Segmentation And Classification

Binaural Acoustic Scene Classification Using Wavelet Scattering, Parallel Ensemble Classifiers and Nonlinear Fusion

Acoustic Scene Classification Across Cities and Devices via Feature Disentanglement

Environmental Sound Classification Using Local Binary Pattern and Audio Features Collaboration

Advanced Framework for Animal Sound Classification With Features Optimization