LiveProbe: Exploring Continuous Voice Liveness Detection Via Phonemic Energy Response Patterns.

Hangcheng Cao, Hongbo Jiang, Daibo Liu, Ruize Wang, Geyong Min, Jiangchuan Liu, Schahram Dustdar, John C. S. Lui
DOI: https://doi.org/10.1109/jiot.2022.3228819
IF: 10.6
2023-01-01
IEEE Internet of Things Journal
Abstract: Voice assistants support contactless control of smart devices and are therefore regarded as a holy grail of human–computer interaction. However, recent studies reveal that an adversary can manipulate devices with malicious voice commands. This security risk arises because liveness detection is executed only once and no safeguard module operates after service activation. Therefore, identifying the speaker type (i.e., human articulators or loudspeakers) is critical to protecting voice-driven services throughout an entire interaction session. In this article, we propose LiveProbe, a continuous voice liveness detection approach that leverages the unique energy response patterns in frequency bands induced by distinct voice generation mechanisms. The rationale behind LiveProbe has two aspects: human articulators reshape the initial voice through exquisitely coordinated movements of the vocal organs, which act as band-pass filters and generate unique energy responses; in contrast, the internal modules of loudspeakers are fixed in position and cannot reproduce this response characteristic. To that end, we first study the voice generation mechanisms of the two speaker types that cause the spectrum differences. Then, we construct signal processing and deep-learning modules to extract liveness features. Notably, our approach does not interfere with normal voice interaction and does not require customized sensors. Experiments demonstrate its effectiveness against potential attacks, with a false acceptance rate of 0.51%.
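The idea of an "energy response pattern in frequency bands" can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `band_energy_profile`, the band edges, and the synthetic two-tone test signal are all assumptions chosen for illustration. It only shows how the share of spectral energy per band might be computed with NumPy before any liveness model sees it.

```python
import numpy as np

def band_energy_profile(signal, sample_rate, band_edges_hz):
    """Return the fraction of spectral energy in each band [low, high).

    Hypothetical helper for illustration; not taken from the paper.
    """
    power = np.abs(np.fft.rfft(signal)) ** 2            # one-sided power spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    energies = []
    for low, high in zip(band_edges_hz[:-1], band_edges_hz[1:]):
        mask = (freqs >= low) & (freqs < high)           # bins inside this band
        energies.append(power[mask].sum())
    energies = np.array(energies)
    total = energies.sum()
    return energies / total if total > 0 else energies

# Synthetic stand-in for a voice frame (assumption): two sinusoids, 1 s at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
frame = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

edges = [0, 1000, 2000, 4000, 8000]                      # assumed band edges in Hz
profile = band_energy_profile(frame, sample_rate, edges)
print(profile)
```

A profile like this, computed per frame across a session, is the kind of feature vector a downstream classifier could consume; how LiveProbe actually segments phonemes and extracts features is described in the full paper, not here.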