Directional Sound-Capture System with Acoustic Array Based on FPGA

Weiming Xiang, Yu Liu, Yiwei Zhou, Yu Wu
DOI: https://doi.org/10.1109/tim.2023.3334344
IF: 5.6
2024-01-01
IEEE Transactions on Instrumentation and Measurement
Abstract: The front-end speech enhancement system is regarded as an essential component for maximizing the performance of smart voice-interaction technology in complicated live acoustic scenes. Existing research has the following limitations: short-distance detection, poor suppression of nonstationary interferers, and inaccurate estimation of the direction of arrival. To tackle these issues, this article proposes a 48-channel acoustic array system for directional sound capture (DSC). The system implements field-programmable gate array (FPGA)-based acquisition together with a signal processing algorithm: an audio-visual (A-V)-based broadband acoustic beamformer. To the authors’ knowledge, this is the first time that a DSC system that uses A-V for terminal voice interaction has been implemented on an FPGA. Experiments were set up in diverse acoustic scenes to evaluate the system’s performance. The results indicate that the proposed system can be widely applied to smart scenes in complicated acoustic environments contaminated with intense background noise and competing nonstationary interferers, and that it can provide real-time speech recognition and classification.
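To make the idea of a broadband acoustic beamformer concrete, the following is a minimal sketch of a textbook frequency-domain delay-and-sum beamformer in Python with NumPy. It is not the authors' algorithm: the abstract does not describe the beamformer design, the array geometry, or how the A-V information is used, and the actual system runs on an FPGA. The function names, the far-field plane-wave model, and the speed-of-sound constant are all assumptions made for illustration.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def steering_delays(mic_positions, azimuth_rad):
    """Per-microphone arrival-time offsets (s) for a far-field plane wave.

    mic_positions: (M, 2) array of x/y coordinates in metres (hypothetical layout).
    azimuth_rad: direction in which the source lies, measured from the x axis.
    """
    direction = np.array([np.cos(azimuth_rad), np.sin(azimuth_rad)])
    # A microphone further along the source direction hears the wave earlier,
    # so its offset relative to the origin is negative.
    return -(mic_positions @ direction) / SPEED_OF_SOUND


def delay_and_sum(signals, mic_positions, azimuth_rad, sample_rate):
    """Frequency-domain delay-and-sum beamformer (illustrative only).

    signals: (M, N) array, one row per microphone.
    Returns the (N,) output steered towards azimuth_rad.
    """
    num_samples = signals.shape[1]
    delays = steering_delays(mic_positions, azimuth_rad)
    freqs = np.fft.rfftfreq(num_samples, d=1.0 / sample_rate)
    spectra = np.fft.rfft(signals, axis=1)
    # Undo each channel's delay with a linear phase shift at every frequency,
    # which is what makes the beamformer broadband, then average the channels.
    phase = np.exp(2j * np.pi * freqs[None, :] * delays[:, None])
    aligned = spectra * phase
    return np.fft.irfft(aligned.mean(axis=0), n=num_samples)
```

Steering towards the true source direction adds the channels coherently, while signals from other directions add with mismatched phases and are attenuated. In a setup like the one the abstract describes, the steering direction would come from a direction-of-arrival estimate; how the A-V component supplies or refines it is not stated in the abstract.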