An Improved Speaker Diarization System for Multiple Distance Microphone Meetings

Zhou Yu, Suo Hongbin, Wang Junjie, Yan Yonghong
DOI: https://doi.org/10.1109/icicta.2012.27
2012-01-01
Abstract: This paper describes an improved speaker diarization system for multiple distance microphone (MDM) meeting conversations. First, the new system includes a modified speech activity detector (SAD). Second, it adopts new spectral features based on the equivalent rectangular bandwidth (ERB) or Bark scale, which are compared with traditional Mel Frequency Cepstral Coefficient (MFCC) features. Third, the system adapts the segment models from a universal background model (UBM). Finally, it is evaluated under the NIST RT-04s MDM conditions. Experimental results show that (1) the new speech/non-speech detector outperforms the one in the baseline system, (2) the proposed spectral features are more effective than MFCC features for speaker diarization, and (3) adapting the segment models from the UBM helps improve system performance. Together, these improvements lead to a diarization error rate of 15.38% on the RT-04s evaluation data, excluding overlapping speech.
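The abstract contrasts ERB- and Bark-scale features with Mel-scale MFCCs. As a rough illustration of how these three frequency warpings differ, the sketch below places filter-bank centre frequencies equally spaced on each scale. The abstract does not say which formulas the authors use, so the specific choices here are assumptions: the common 2595·log10 Mel formula, Traunmüller's Bark approximation, and the Glasberg and Moore ERB-rate formula. The function names, the 24-band count, and the 100-4000 Hz range are likewise hypothetical.

```python
import math

# Forward and inverse warpings (Hz <-> warped units).
# These particular formulas are assumed, not taken from the paper.

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def hz_to_bark(f):
    # Traunmueller's approximation of the Bark scale
    return 26.81 * f / (1960.0 + f) - 0.53

def bark_to_hz(z):
    return 1960.0 * (z + 0.53) / (26.28 - z)

def hz_to_erb_rate(f):
    # Glasberg & Moore ERB-rate scale
    return 21.4 * math.log10(1.0 + 0.00437 * f)

def erb_rate_to_hz(e):
    return (10.0 ** (e / 21.4) - 1.0) / 0.00437

def band_centers(to_scale, from_scale, f_lo, f_hi, n_bands):
    """Return n_bands centre frequencies (Hz), equally spaced on the warped scale."""
    lo, hi = to_scale(f_lo), to_scale(f_hi)
    step = (hi - lo) / (n_bands - 1)
    return [from_scale(lo + i * step) for i in range(n_bands)]

if __name__ == "__main__":
    scales = {
        "mel": (hz_to_mel, mel_to_hz),
        "bark": (hz_to_bark, bark_to_hz),
        "erb": (hz_to_erb_rate, erb_rate_to_hz),
    }
    for name, (fwd, inv) in scales.items():
        centers = band_centers(fwd, inv, 100.0, 4000.0, 24)
        print(name, [round(c) for c in centers[:6]], "...")
```

Each scale spaces the low-frequency bands differently, which changes how much spectral detail the resulting cepstral features retain in the region where speaker-dependent cues are concentrated.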