Scalable Identity-Oriented Speech Retrieval

Chaotao Chen, Di Jiang, Jinhua Peng, Rongzhong Lian, Yawen Li, Chen Zhang, Lei Chen, Lixin Fan
DOI: https://doi.org/10.1109/tkde.2021.3127520
IF: 9.235
2021-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: With the prevalence of voice devices in daily life, speech data is accumulating at an unprecedented speed, forming an invaluable database for security surveillance and financial risk management. In these applications, a key task is, given a query speech snippet, to retrieve all speech snippets uttered by the same speaker as the query, namely Identity-Oriented Speech Retrieval (IO-SR). In this paper, we propose an accurate and scalable system for IO-SR, which seamlessly integrates speaker modeling and deep indexing techniques. Evaluations on an industrial dataset containing millions of speech snippets show that our system achieves superior performance compared with state-of-the-art methods.
computer science, information systems, artificial intelligence, engineering, electrical & electronic
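
The IO-SR task defined in the abstract can be illustrated with a minimal sketch. It assumes each speech snippet has already been mapped to a fixed-length speaker embedding by some speaker model, and it ranks snippets by cosine similarity with a brute-force scan. This is not the paper's method: the abstract does not describe the speaker model or the deep indexing scheme, and a system handling millions of snippets would replace the linear scan with an index. The function names, the toy vectors, and the 0.8 threshold are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_same_speaker(query, corpus, threshold=0.8):
    """Return (snippet_id, score) pairs whose similarity to the query
    embedding is at least `threshold`, best match first.

    `corpus` maps snippet ids to speaker embeddings. The threshold is an
    illustrative assumption, not a value taken from the paper.
    """
    scored = ((sid, cosine(query, emb)) for sid, emb in corpus.items())
    hits = [(sid, score) for sid, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy embeddings: a1 and a2 stand in for one speaker, b1 for another.
corpus = {
    "a1": [0.9, 0.1, 0.0],
    "a2": [0.8, 0.2, 0.1],
    "b1": [0.0, 0.1, 0.9],
}
query = [1.0, 0.1, 0.0]
hits = retrieve_same_speaker(query, corpus)
print([sid for sid, _ in hits])
```

The linear scan costs one similarity computation per stored snippet, which is the scalability bottleneck that an indexing layer is meant to remove.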