Identification of Objectionable Audio Segments Based on Pseudo and Heterogeneous Mixture Models

Ziqiang Shi, Jiqing Han, Tieran Zheng, Ji Li
DOI: https://doi.org/10.1109/tasl.2012.2229980
2013-01-01
IEEE Transactions on Audio Speech and Language Processing
Abstract: In this paper, we generalize the Gaussian Mixture Model (GMM) in two ways: a) by introducing novel distance measures between two vectors, based on nonlinear maps, to obtain more general mixture models; b) by building mixture models from multiple different kinds of distributions. These two generalizations address different problems that arise in feature modeling. The mixture model obtained by the first method is called the pseudo Gaussian Mixture Model (pseudo GMM). Compared with the traditional GMM, the pseudo GMM with nonlinear maps performs better on nonlinear problems, while its computational complexity, judged by the iteration procedure, is almost the same as that of the Expectation-Maximization (EM) algorithm for the traditional GMM. The second generalization is motivated by the observation that practical learning problems often involve multiple, heterogeneous data sources, whereas classical mixture models are based on a single kind of distribution. We therefore consider heterogeneous mixture models (hetMM) built from multiple different kinds of distributions. The different types of distributions in a hetMM may have quite different properties and may capture different features of the data. Component classifiers, including pseudo GMM and hetMM based classifiers, are employed in our task of erotic audio recognition. Experimental results with classifiers built on pseudo GMM and hetMM demonstrate the effectiveness of the proposed models, and online and off-line experiments show that the proposed approach is highly effective for erotic audio recognition.
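The idea of a mixture built from different kinds of distributions can be illustrated with a minimal sketch. The abstract does not say which distribution families or which training procedure the paper uses, so everything below is an assumption made for illustration: a one-dimensional mixture with one Gaussian and one Laplace component, hypothetical weights and parameters, and a posterior (responsibility) computation of the kind an EM-style E-step would use.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def laplace_pdf(x, loc, scale):
    """Density of a 1-D Laplace distribution with the given location and scale."""
    return math.exp(-abs(x - loc) / scale) / (2.0 * scale)


# Hypothetical mixture (illustrative values, not taken from the paper):
# each entry is (weight, density function, parameters); weights sum to 1.
COMPONENTS = [
    (0.6, gaussian_pdf, (0.0, 1.0)),
    (0.4, laplace_pdf, (3.0, 0.5)),
]


def mixture_pdf(x, components=COMPONENTS):
    """Mixture density: weighted sum of component densities from different families."""
    return sum(w * pdf(x, *params) for w, pdf, params in components)


def responsibilities(x, components=COMPONENTS):
    """Posterior probability of each component given x (an E-step style quantity)."""
    weighted = [w * pdf(x, *params) for w, pdf, params in components]
    total = sum(weighted)
    return [v / total for v in weighted]
```

Calling `responsibilities(0.0)` assigns nearly all of the posterior mass to the Gaussian component, and `responsibilities(3.0)` to the Laplace component, which shows how components from different families can each account for a different region of the data.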