ABSTRACT

Automatic discrimination of speech and music is an important tool in many multimedia applications. The paper presents a low complexity but effective approach for speech/music discrimination, which exploits only one simple feature, called Warped LPC-based Spectral Centroid (WLPC-SC). A three-component Gaussian Mixture Model (GMM) classifier is used because it showed a slightly better performance than other Statistical Pattern Recognition (SPR) classifiers. Comparison between WLPC-SC and the timbral features proposed in [11] is performed, aiming to assess the good discriminatory power of the proposed feature. Experimental results reveal that our speech/music discriminator is robust and fast, making it suitable for real-time multimedia applications.

1. INTRODUCTION

Automatic discrimination between speech and music has become a research topic of interest in the last few years. Several approaches have been described in the recent literature for different applications [1][2][3][4][5]. Each of these uses different features and pattern classification techniques and describes results on different material.

Saunders [1] proposed a real-time speech/music discriminator, which was used to automatically monitor the audio content of FM audio channels. Four statistical features on the zero-crossing rate and one energy-related feature were extracted, and a multivariate-Gaussian classifier was applied, which resulted in an accuracy of 98%.

In Automatic Speech Recognition (ASR) of broadcast news, it is desirable to disable the input to the speech recognizer during the non-speech portion of the audio stream. Scheirer and Slaney [2] developed a speech/music discrimination system for ASR of audio sound tracks.
Thirteen features to characterize distinct properties of speech and music, and three classification schemes (MAP Gaussian, GMM and k-NN classifiers) were exploited, resulting in an accuracy of over 90%.

Another application that can benefit from distinguishing speech from music is low bit-rate audio coding. Designing a universal coder that reproduces both speech and music well is the best approach. However, it is not a trivial problem. An alternative approach is to design a multi-mode coder that can accommodate different signals. The appropriate module is selected using the output of a speech/music classifier [6][7].

Automatic discrimination of speech and music is an important tool in many multimedia applications. Khaled El-Maleh et al. [3] combined the line spectral frequencies and zero-crossings-based features for frame-level narrowband speech/music discrimination. The classification system operates using only a frame delay of 20 ms, making it suitable for real-time multimedia applications. An emerging multimedia application is content-based indexing and retrieval of audiovisual data. Audio content analysis is an important task for such applications [8]. Minami et al. [9] proposed an audio-based approach to video indexing, where a speech/music detector is used to help users browse a video database.

A comparative view of the value of different types of features in speech/music discrimination is provided in [10], where four types of features (amplitudes, cepstra, pitch and zero-crossings) are compared for discriminating speech and music signals. Experimental results showed that cepstra and delta cepstra bring the best performance. Mel Frequency Spectral or Cepstral Coefficients (MFSC or MFCC) are very often used features for audio classification tasks, providing quite good results. In [4], MFSC's first order statistics are combined with neural networks to form a speech/music classifier that is able to generalize from a small amount of learning data.
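Several of the systems surveyed above ([1], [3], [10]) build on zero-crossing measurements taken frame by frame. The following is a minimal sketch of that idea only, not the feature set of any cited paper: the function names, the 8 kHz sample rate, the non-overlapping 20 ms frames and the synthetic two-tone input are all assumptions made for illustration.

```python
import math

def frame_signal(samples, frame_len):
    """Split a sequence into consecutive non-overlapping frames (tail dropped)."""
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def frame_zcr(samples, sample_rate=8000, frame_ms=20):
    """Zero-crossing rate of each frame; frame size is an assumed 20 ms."""
    frame_len = int(sample_rate * frame_ms / 1000)
    return [zero_crossing_rate(f) for f in frame_signal(samples, frame_len)]

# Illustrative input (assumed): a 100 Hz tone followed by a 2000 Hz tone,
# 0.1 s each, so the per-frame rate rises sharply halfway through.
sample_rate = 8000
low = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(800)]
high = [math.sin(2 * math.pi * 2000 * n / sample_rate) for n in range(800)]
zcr = frame_zcr(low + high, sample_rate)
```

A discriminator would typically not use these per-frame values directly but statistics computed over many frames, as in the four statistical zero-crossing features of [1]; the exact statistics are not specified in this section.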
MFCC are a compact representation of the spectrum of an audio signal taking into account the nonlinear human perception of pitch, as described by the mel scale. They are one of the most used features in speech recognition and have recently been proposed in musical genre classification of audio signals [11][12].

Unlike the previous works, speech/music discrimination approaches based on only one type of features are presented in [13] and [5], which result in fast and robust classification systems. The approach in [13] takes psychoacoustic knowledge into account in that it uses the low frequency modulation amplitudes over 20 critical bands to form a good discriminator for the task, while the approach in [5] exploits a new energy-related feature, called modified low energy ratio, that improves the results obtained with the classical low energy ratio.

In this paper, we present our contribution to the design of a robust speech/music discrimination system. The paper presents a low complexity but effective approach, which also exploits only one simple feature, called Warped


New Warped LPC-Based Feature for Fast and Robust Speech/Music Discrimination