Feature Analysis for Speech/Music Automatic Classification

卢坚,陈毅松,孙正兴,张福炎
DOI: https://doi.org/10.3321/j.issn:1003-9775.2002.03.010
2002-01-01
Abstract: Features that discriminate between speech and music are analyzed, including perceptual features such as pitch, brightness, and harmonicity, as well as Mel-Frequency Cepstral Coefficients (MFCC). Their performance is evaluated in a left-right discrete HMM-based audio classifier, which assigns audio to one of three categories (speech, music, or a mixture of the two) using the maximum likelihood criterion. The experimental results show that the selected features are effective for speech/music classification and that the classification accuracy is excellent.
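The maximum likelihood decision rule mentioned in the abstract can be illustrated with a minimal sketch. It assumes one discrete HMM per category, scores a sequence of codebook indices with the scaled forward algorithm, and picks the category whose model gives the highest log-likelihood. The names `log_likelihood`, `classify`, and `MODELS`, the two-state topology, the two-symbol codebook, and all probabilities are hypothetical toy values chosen for illustration; the abstract does not give the paper's actual model sizes, codebook, or trained parameters.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n_states = len(start)
    # Initialise with the start distribution and the first emitted symbol.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then emit.
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob

def classify(obs, models):
    """Return the category whose HMM maximises the likelihood of obs."""
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))

# Left-right topology (assumed): start in state 0, never move backwards.
START = [1.0, 0.0]
TRANS = [[0.7, 0.3],
         [0.0, 1.0]]

# Toy emission tables over a 2-symbol codebook (invented values).
MODELS = {
    "speech": (START, TRANS, [[0.9, 0.1], [0.8, 0.2]]),
    "music":  (START, TRANS, [[0.1, 0.9], [0.2, 0.8]]),
    "mixed":  (START, TRANS, [[0.5, 0.5], [0.5, 0.5]]),
}

if __name__ == "__main__":
    print(classify([0, 0, 0, 1, 0], MODELS))
    print(classify([1, 1, 0, 1, 1], MODELS))
    print(classify([0, 1, 0, 1, 0, 1], MODELS))
```

In a real system the observation symbols would come from vector-quantizing frame-level features such as MFCCs, and the model parameters would be estimated from labeled training audio rather than written by hand.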