Two microphone based direction of arrival estimation for multiple speech sources using spectral properties of speech

Wenyi Zhang, Bhaskar D. Rao
DOI: https://doi.org/10.1109/ICASSP.2009.4960053
2009-01-01
Abstract: A two-microphone direction of arrival (DOA) estimation technique for multiple speech sources is developed which exploits speech-specific properties, namely sparsity in the time-frequency (spectrum) domain. For robustness, we exploit the sparsity in the frequency domain by focusing on the spectral content concentrated in sinusoidal tracks obtained through sinusoidal modeling. When multiple speech signals are mixed in the two-microphone system, the inter-channel phase differences (IPD) between the two channels on those sinusoidal tracks are dominated by the spatial information of the most powerful source at that specific time-frequency point, because of spectral sparsity and masking effects. The source localization problem thereby becomes a clustering problem on the IPD versus frequency plot, and the generalized mixture decomposition algorithm (GMDA) is used to cluster the groups of points corresponding to the multiple sources. The DOA of each source is derived from the parameters of its cluster. Experimental results show the scheme to be very effective.
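
To make the link between an IPD versus frequency plot and a DOA concrete, here is a minimal Python sketch for a single source. It is not the paper's method: it assumes an ideal far-field model in which the IPD grows linearly with frequency, IPD(f) = 2πf·d·sin(θ)/c, and it replaces the GMDA clustering step with a plain least-squares line fit through the origin. The microphone spacing, the speed of sound, the sign convention for θ, and all function names are hypothetical choices made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature
MIC_SPACING = 0.04      # m, hypothetical spacing, small enough to avoid phase wrapping below 3 kHz


def ipd_for_source(freq_hz, doa_deg, spacing=MIC_SPACING, c=SPEED_OF_SOUND):
    """Ideal far-field IPD in radians, wrapped to (-pi, pi], at one frequency."""
    delay = spacing * math.sin(math.radians(doa_deg)) / c  # inter-microphone delay in seconds
    phase = 2.0 * math.pi * freq_hz * delay
    return math.atan2(math.sin(phase), math.cos(phase))  # wrap the phase


def doa_from_ipd_points(freqs_hz, ipds_rad, spacing=MIC_SPACING, c=SPEED_OF_SOUND):
    """Fit IPD = slope * f through the origin and convert the slope to a DOA in degrees.

    Assumes every point belongs to the same source and that no phase wrapping occurred.
    """
    num = sum(f * p for f, p in zip(freqs_hz, ipds_rad))
    den = sum(f * f for f in freqs_hz)
    slope = num / den                      # radians per Hz, equal to 2*pi*delay
    delay = slope / (2.0 * math.pi)
    sin_theta = max(-1.0, min(1.0, delay * c / spacing))  # clamp against numerical overshoot
    return math.degrees(math.asin(sin_theta))


# Synthetic (frequency, IPD) points for one source at 30 degrees, 200 Hz to 3000 Hz.
freqs = [200.0 * k for k in range(1, 16)]
ipds = [ipd_for_source(f, 30.0) for f in freqs]
estimated = doa_from_ipd_points(freqs, ipds)
print(round(estimated, 3))
```

With several simultaneous sources, the points on the plot would fall near several such lines, one slope per source, which is the grouping that the clustering stage described in the abstract has to recover before a DOA can be read off each cluster.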