Multilingual conversational telephony speech corpus creation for real world speaker diarization and recognition

Soma Khan, Joyanta Basu, Madhab Pal, Rajib Roy, Milton Samirakshma Bepari
DOI: https://doi.org/10.1109/icsda.2016.7919007
2016-10-01
Abstract: Real-world speech data, especially telephone conversations recorded in different background situations, pose a greater challenge for automatic speaker diarization and recognition. The limited number of real-world conversational speech resources restricts research and development in this area. The present work describes the design and development process of a real-world multilingual conversational telephony speech corpus, considering all kinds of uncertainties and adversities of real-time conditions. Conversations in the corpus are entirely transcribed to support the needs of automatic speaker diarization and recognition. The real-world nature of the collected data has been verified through corpus statistics in terms of variations in speakers' age group, topic of conversation, gender pairs in conversation, recording environment, and language.
What problem does this paper attempt to address?