Abstract: Audio sentiment analysis is a popular research area that extends text-based sentiment analysis and depends on the effectiveness of acoustic features extracted from speech. However, current work on audio sentiment analysis mainly focuses on extracting homogeneous acoustic features, or does not fuse heterogeneous features effectively. In this paper, we propose an utterance-based deep neural network model that combines a CNN-based network and an LSTM-based network in parallel to obtain representative features, termed the Audio Sentiment Vector (ASV), that maximally reflect the sentiment information in an audio recording. Specifically, the model is trained with utterance-level labels, and the ASV is extracted from the two branches and fused. In the CNN branch, spectrum graphs produced from the signals are fed as inputs, while in the LSTM branch, the inputs include the spectral centroid, MFCC and other established traditional acoustic features extracted from dependent utterances in the recording. In addition, a BiLSTM with an attention mechanism is used for feature fusion. Extensive experiments show that our model recognizes audio sentiment precisely and that the ASV outperforms both traditional acoustic features and vectors extracted from other deep learning models. Furthermore, experimental results indicate that the proposed model outperforms state-of-the-art approaches by 9.33% on MOSI.
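The spectral centroid named above as one input to the LSTM branch is the magnitude-weighted mean frequency of a frame. The following is a minimal sketch of that standard definition using only the Python standard library; it is not the authors' extraction pipeline, and the function name `spectral_centroid`, the naive DFT, and the synthetic test frame are assumptions made for illustration.

```python
import cmath
import math


def spectral_centroid(samples, sample_rate):
    """Return the magnitude-weighted mean frequency (Hz) of one frame."""
    n = len(samples)
    weighted_sum = 0.0
    magnitude_sum = 0.0
    # Naive DFT over the non-negative frequency bins (0 .. n/2).
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        magnitude = abs(coeff)
        weighted_sum += (k * sample_rate / n) * magnitude
        magnitude_sum += magnitude
    # A silent frame has no defined centroid; return 0.0 by convention.
    return weighted_sum / magnitude_sum if magnitude_sum > 0 else 0.0


# Hypothetical test frame: a 100 Hz sine sampled at 800 Hz for 64 samples.
frame = [math.sin(2 * math.pi * 100 * t / 800) for t in range(64)]
centroid = spectral_centroid(frame, 800)
```

For a pure tone that falls exactly on a DFT bin, as in the test frame here, the centroid coincides with the tone's frequency. A practical system would more likely use an FFT-based audio library and compute the feature per short overlapping frame.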

Audio classification based on maximum entropy model

Maximum Likelihood I-Vector Space Using PCA for Speaker Verification

Audio Sentiment Analysis by Heterogeneous Signal Features Learned from Utterance-Based Parallel Neural Network

A Method Based on General Model and Rough Set for Audio Classification

An Approach Based on Reverse Hidden Markov Model for Audio Classification

Semi-supervised Minimum Redundancy Maximum Relevance Feature Selection for Audio Classification

Research on Chinese Person Name and Location Name Recognition Based on Maximum Entropy Model

Prosodic word prediction using a maximum entropy approach

Prosodic boundary prediction based on maximum entropy model with error-driven modification

Robust Discriminant Analysis Based on Nonparametric Maximum Entropy

Performance evaluation based on maximum entropy Markov model

Research and Implementation of Identifying Music through Performances Using Entropy Based Audio-fingerprint

An Improved Maximum Entropy Language Model and Its Application

Auditory Context Classification Using Random Forests

Sub-Band Optimization with Criterion of Maximum Weighting Entropy and Its Application in Pattern Classification

Feature Analysis for Speech/Music Automatic Classification

Efficient representation and fast look-up of Maximum Entropy language models

Music Classification Via the Bag-of-features Approach

Hybrid Independent Component Analysis and Rough Set Approach for Audio Feature Extraction

Modulation Signal Recognition Based on Information Entropy and Ensemble Learning

SemanticAC: Semantics-Assisted Framework for Audio Classification