Adaptive DCTNet for Audio Signal Classification

Yin Xian, Yunchen Pu, Zhe Gan, Liang Lu, Andrew Thompson

DOI: https://doi.org/10.1121/1.4970932

2017-04-30

Abstract: In this paper, we investigate DCTNet for audio signal classification. Its output feature is related to Cohen's class of time-frequency distributions. We introduce the use of adaptive DCTNet (A-DCTNet) for audio signal feature extraction. The A-DCTNet applies the idea of the constant-Q transform, with the center frequencies of its filterbanks geometrically spaced. The A-DCTNet is adaptive to different acoustic scales, and it can better capture low-frequency acoustic information that is sensitive to human audio perception than features such as Mel-frequency spectral coefficients (MFSC). We use features extracted by the A-DCTNet as input for classifiers. Experimental results show that the A-DCTNet and Recurrent Neural Networks (RNN) achieve state-of-the-art performance in bird song classification rate, and improve artist identification accuracy in music data. They demonstrate A-DCTNet's applicability to signal processing problems.

Sound

What problem does this paper attempt to address?

This paper attempts to solve several key problems in audio signal classification:

1. **Improvement of feature representation**: Although traditional audio signal features (such as Mel-Frequency Cepstral Coefficients (MFCC) and ERB-rate scale features) can reveal the intrinsic properties of audio signals, they are sensitive to noise. The paper proposes a new feature extraction method, Adaptive DCTNet (A-DCTNet), aiming to improve the robustness and effectiveness of the feature representation.
2. **Capture of low-frequency information**: The human auditory system is particularly sensitive to low-frequency information, which traditional features capture poorly. A-DCTNet uses filters with geometrically spaced center frequencies, so it captures low-frequency acoustic information better and improves the quality of the feature representation.
3. **Optimization of time-frequency analysis**: The paper shows that the output of the two-layer DCTNet belongs to Cohen's class of time-frequency distributions, which gives DCTNet a sound theoretical basis in time-frequency analysis. This property helps DCTNet handle the complex structure of audio signals.
4. **Combination with Recurrent Neural Networks (RNN)**: To exploit the sequential information in audio signals, the paper feeds A-DCTNet features into an RNN, further improving classification performance. The RNN can capture long-term dependencies in audio signals, and the combination achieves state-of-the-art classification results on bird song data and improves artist identification accuracy on music data.

In summary, the paper focuses on improving the accuracy and robustness of audio signal classification through a better feature extraction method, with particular attention to low-frequency information capture and time-frequency analysis.
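The geometric spacing of center frequencies borrowed from the constant-Q transform can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `constant_q_layout`, the parameter values (55 Hz to 7040 Hz, 12 bins per octave, 16 kHz sampling), and the window-length rule are assumptions taken from the standard constant-Q formulation, where each center frequency is the previous one multiplied by a fixed ratio and the quality factor Q is the same for every bin.

```python
import math


def constant_q_layout(f_min, f_max, bins_per_octave, sample_rate):
    """Return (center_freqs, q_factor, window_lengths) for a constant-Q filterbank.

    Hypothetical helper for illustration; the parameter names are assumptions,
    not taken from the paper.
    """
    # Ratio between adjacent center frequencies (geometric spacing).
    ratio = 2.0 ** (1.0 / bins_per_octave)
    # Quality factor: center frequency divided by bandwidth, identical for all bins.
    q_factor = 1.0 / (ratio - 1.0)
    # Number of bins needed to cover [f_min, f_max].
    n_bins = int(math.floor(bins_per_octave * math.log2(f_max / f_min))) + 1
    center_freqs = [f_min * ratio ** k for k in range(n_bins)]
    # Longer analysis windows at low frequencies, shorter ones at high frequencies.
    window_lengths = [math.ceil(q_factor * sample_rate / f) for f in center_freqs]
    return center_freqs, q_factor, window_lengths


freqs, q, lengths = constant_q_layout(
    f_min=55.0, f_max=7040.0, bins_per_octave=12, sample_rate=16000
)
print(f"{len(freqs)} bins, Q = {q:.2f}")
print(f"lowest bin:  {freqs[0]:.1f} Hz, window {lengths[0]} samples")
print(f"highest bin: {freqs[-1]:.1f} Hz, window {lengths[-1]} samples")
```

Because the window length scales inversely with the center frequency, the low-frequency bins are analyzed with long windows and therefore fine frequency resolution, which is the property the summary attributes to A-DCTNet's handling of low-frequency content.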


Automatic Respiratory Sound Classification Via Multi-Branch Temporal Convolutional Network

A Deep Neural Network for Audio Classification with a Classifier Attention Mechanism

Audio Scanning Network: Bridging Time and Frequency Domains for Audio Classification

Simplified inverse filter tracked affective acoustic signals classification incorporating deep convolutional neural networks

Deep Neural Network Based Environment Sound Classification and Its Implementation on Hearing Aid App

Audio-Based Music Classification with DenseNet And Data Augmentation

Hierarchical-Concatenate Fusion TDNN for sound event classification

AmtNet: Attentional multi-scale temporal network for phonocardiogram signal classification

Densely Connected Networks with Multiple Features for Classifying Sound Signals with Reverberation

Deep Neural Network Derived Bottleneck Features For Accurate Audio Classification

Using Deep Belief Network to Capture Temporal Information for Audio Event Classification

Spectral and Rhythm Features for Audio Classification with Deep Convolutional Neural Networks

DD-CNN: Depthwise Disout Convolutional Neural Network for Low-complexity Acoustic Scene Classification

Sample Dropout for Audio Scene Classification Using Multi-Scale Dense Connected Convolutional Neural Network

Domestic Activities Classification from Audio Recordings Using Multi-scale Dilated Depthwise Separable Convolutional Network

Augmented TDNN for frequency and scale invariant sequence classification

An Auditory Convolutional Neural Network for Underwater Acoustic Target Timbre Feature Extraction and Recognition

Audio Classification of Low Feature Spectrograms Utilizing Convolutional Neural Networks

Effective Sample Selection and Enhancement of Long Short-Term Dependencies in Signal Detection: HDC-Inception and Hybrid CE Loss

Advanced Framework for Animal Sound Classification With Features Optimization