Voice-Indistinguishability: Protecting Voiceprint in Privacy-Preserving Speech Data Release

Yaowei Han, Sheng Li, Yang Cao, Qiang Ma, Masatoshi Yoshikawa
DOI: https://doi.org/10.48550/arXiv.2004.07442
2020-04-16
Abstract: With the development of smart devices such as the Amazon Echo and Apple's HomePod, speech data have become a new dimension of big data. However, privacy and security concerns may hinder the collection and sharing of real-world speech data, which contain speaker-identifying information, namely the voiceprint, considered a type of biometric identifier. Current studies on voiceprint privacy protection provide neither a meaningful privacy-utility trade-off nor a formal and rigorous definition of privacy. In this study, we design a novel and rigorous privacy metric for voiceprint privacy, referred to as voice-indistinguishability, by extending differential privacy. We also propose mechanisms and frameworks for privacy-preserving speech data release that satisfy voice-indistinguishability. Experiments on public datasets verify the effectiveness and efficiency of the proposed methods.
Cryptography and Security; Sound; Audio and Speech Processing