Robotic voice assistant equipped with binaural audio

Mingsian R Bai, Yi-Cheng, Tsung-Han, Wen-Chuen
2019-01-01
Abstract: A robotic voice assistant is constructed for entertainment services. The robot consists primarily of three functional units: a microphone array, a cloud-based voice assistant, and a binaural rendering loudspeaker array. The microphone array is used to locate the human user and to extract the speech commands given by the user. The extracted commands are then sent to a convolutional neural network that we built ourselves. The response from the cloud is broadcast at the robot end by a linear loudspeaker array. Various binaural processing modes are implemented using a special inverse filtering approach. The inverse filters are formulated in the time domain, which makes them immune to noncausal artifacts such as wraparound errors and pre-ringing that are frequently encountered in frequency-domain formulations. Frequency-domain weighting and equalization nevertheless remain possible in the proposed approach. An industrial personal computer coordinates the preceding processing units. With these three units working in tandem, the proposed robot is capable of interpreting human commands and responding with immersive binaural audio.
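To make the idea of a time-domain inverse filter concrete, the following is a minimal single-channel sketch, not the authors' multichannel loudspeaker-array design. It assumes a known impulse response `g`, builds its convolution matrix, and solves a Tikhonov-regularized least-squares problem so that `g` convolved with the filter `h` approximates a delayed unit impulse. The function names, the modeling delay `delay`, and the regularization weight `beta` are illustrative assumptions. Because the problem is posed with linear (not circular) convolution, no wraparound error arises by construction.

```python
import numpy as np

def convolution_matrix(g, n_taps):
    """Return G such that G @ h equals np.convolve(g, h) for len(h) == n_taps."""
    g = np.asarray(g, dtype=float)
    G = np.zeros((len(g) + n_taps - 1, n_taps))
    for col in range(n_taps):
        # Each column is the impulse response shifted down by one more sample.
        G[col:col + len(g), col] = g
    return G

def time_domain_inverse(g, n_taps, delay, beta=1e-6):
    """Regularized least-squares FIR inverse of g (hypothetical helper).

    Minimizes ||G h - d||^2 + beta * ||h||^2, where d is a unit impulse
    placed at index `delay` (the modeling delay).
    """
    G = convolution_matrix(g, n_taps)
    d = np.zeros(G.shape[0])
    d[delay] = 1.0
    A = G.T @ G + beta * np.eye(n_taps)
    return np.linalg.solve(A, G.T @ d)

# Toy impulse response (assumed for illustration only).
g = np.array([1.0, 0.5, 0.25])
h = time_domain_inverse(g, n_taps=32, delay=4)

# The equalized response should be close to an impulse at the chosen delay.
equalized = np.convolve(g, h)
```

In a sketch like this, a frequency-dependent weighting could be introduced by filtering the target `d` or weighting the rows of the least-squares problem, which is one way to read the abstract's remark that frequency-domain weighting remains possible; the source does not specify how the authors do it.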