Abstract: The recent ubiquitous adoption of remote conferencing has been accompanied by widespread frustration with distorted or otherwise unclear voice communication. Audio enhancement can compensate for low-quality input signals from, for example, small true wireless earbuds, by applying noise suppression techniques. Such processing relies on low-latency voice activity detection (VAD) with the added capability of discriminating the wearer's voice from others, a task of significant computational complexity. The tight energy budget of devices as small as modern earphones, however, requires any system tackling this problem to do so with minimal power and processing overhead, and, for usability reasons, without relying on speaker-specific voice samples and training. This paper presents the design and implementation of a custom research platform for low-power wireless earbuds based on novel, commercial MEMS bone-conduction microphones. Such microphones can record the wearer's speech with much greater isolation, enabling personalized voice activity detection and further audio enhancement applications. Furthermore, the paper evaluates a proposed low-power personalized speech detection algorithm based on bone conduction data and a recurrent neural network running on the implemented research platform. This algorithm is compared to an approach based on traditional microphone input. The bone conduction system detects speech within 12.8 ms at an accuracy of 95%. Different SoC choices are contrasted, with the final implementation, based on the Ambiq Apollo 4 Blue SoC, achieving 2.64 mW average power consumption at 14 µJ per inference and reaching 43 h of battery life on a miniature 32 mAh Li-ion cell without duty cycling.

What problem does this paper attempt to address?

This paper addresses unclear voice communication from small true wireless earbuds by using low-power bone conduction microphones for audio enhancement. Traditional microphones often capture poor voice quality and are susceptible to environmental noise because of their distance from the wearer's mouth. Bone conduction microphones isolate the wearer's voice better and are therefore suitable for personalized voice activity detection (pVAD) and further audio enhancement applications.

The main contributions of this paper are as follows:

1. Design and implementation of a research platform for low-power wireless earbuds based on a novel commercial MEMS bone conduction microphone.
2. Development and evaluation of an ultra-low-power, bone-conduction-based personalized voice activity detection algorithm, together with an exploration of further energy-saving possibilities.
3. Comparison of the industry-standard Nordic nRF5340 and the Ambiq Apollo 4 Blue for ultra-low-power edge processing. The Ambiq Apollo 4 Blue was selected, achieving an average power consumption of 2.64 mW and running for 43 hours on a tiny 32 mAh lithium-ion battery without duty cycling.

The paper also points out that, although low-power VAD (voice activity detection) has been studied before, existing methods often fail to differentiate between the target speaker and others, which makes them unsuitable for speech enhancement in earbuds. Bone conduction microphones provide a new way to distinguish the wearer's voice and, combined with TinyML techniques, reduce processing requirements without needing speaker-specific training samples.

In addition, the authors built their own dataset containing bone conduction, air conduction, and external noise recordings for model training and testing. A lightweight pVAD model based on a recurrent neural network with approximately 5000 parameters was designed to run in resource-constrained environments.
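To give a feel for how small a recurrent network of roughly 5000 parameters is, the following is a minimal sketch of a single-layer GRU followed by a one-unit sigmoid output, processed one feature frame at a time. The summary does not state the actual architecture, so the choice of a GRU, the input size of 40 features, the hidden size of 25, and all names (`TinyGruVad`, `gru_param_count`, `dense_param_count`) are assumptions made purely for illustration. The weights are random, so the output is not a meaningful speech probability.

```python
import math
import random

# HYPOTHETICAL sizes: the source only states "approximately 5000 parameters".
INPUT_SIZE = 40    # assumed number of features per audio frame
HIDDEN_SIZE = 25   # assumed GRU state size


def gru_param_count(input_size, hidden_size):
    """Three gates, each with input weights, recurrent weights and one bias."""
    per_gate = hidden_size * input_size + hidden_size * hidden_size + hidden_size
    return 3 * per_gate


def dense_param_count(in_features, out_features):
    """Fully connected layer: weight matrix plus bias vector."""
    return in_features * out_features + out_features


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def _matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def _rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class TinyGruVad:
    """Illustrative streaming GRU with a single sigmoid output (untrained)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        # Index 0: update gate, 1: reset gate, 2: candidate state.
        self.W = [_rand_matrix(rng, hidden_size, input_size) for _ in range(3)]
        self.U = [_rand_matrix(rng, hidden_size, hidden_size) for _ in range(3)]
        self.b = [[0.0] * hidden_size for _ in range(3)]
        self.w_out = [rng.gauss(0.0, 0.1) for _ in range(hidden_size)]
        self.b_out = 0.0
        self.h = [0.0] * hidden_size  # recurrent state carried across frames

    def _gate(self, k, x, h, activation):
        wx = _matvec(self.W[k], x)
        uh = _matvec(self.U[k], h)
        return [activation(a + c + d) for a, c, d in zip(wx, uh, self.b[k])]

    def step(self, frame):
        """Consume one feature frame and return a value in (0, 1)."""
        z = self._gate(0, frame, self.h, _sigmoid)
        r = self._gate(1, frame, self.h, _sigmoid)
        gated_h = [ri * hi for ri, hi in zip(r, self.h)]
        cand = self._gate(2, frame, gated_h, math.tanh)
        self.h = [(1.0 - zi) * hi + zi * ci
                  for zi, hi, ci in zip(z, self.h, cand)]
        logit = sum(w * hi for w, hi in zip(self.w_out, self.h)) + self.b_out
        return _sigmoid(logit)


total_params = (gru_param_count(INPUT_SIZE, HIDDEN_SIZE)
                + dense_param_count(HIDDEN_SIZE, 1))
model = TinyGruVad(INPUT_SIZE, HIDDEN_SIZE)
score = model.step([0.0] * INPUT_SIZE)
print(total_params, score)
```

With these assumed sizes the count comes to 4976 parameters, which is in the range the summary describes. Because only the hidden state is carried between frames, memory use stays constant regardless of how long the audio stream runs, which is one reason recurrent models suit always-on detection on small SoCs.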
Experimental results demonstrate that the bone conduction system outperforms traditional microphone inputs in speech detection, with higher accuracy and robustness.
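The reported energy figures can be sanity-checked with simple arithmetic. The sketch below is a back-of-the-envelope estimate, not the authors' measurement method. It assumes a nominal Li-ion cell voltage of 3.7 V and ideal power conversion, neither of which is stated in the text, and it assumes that one inference runs every 12.8 ms (the text reports 12.8 ms only as the detection latency). The function names are illustrative.

```python
# Figures reported in the text.
AVG_POWER_MW = 2.64              # average power consumption
CAPACITY_MAH = 32.0              # battery capacity
ENERGY_PER_INFERENCE_UJ = 14.0   # energy per inference
DETECTION_LATENCY_MS = 12.8      # reported speech detection latency

# ASSUMPTIONS, not taken from the source.
NOMINAL_CELL_V = 3.7                        # typical Li-ion nominal voltage
INFERENCE_PERIOD_MS = DETECTION_LATENCY_MS  # one inference per 12.8 ms frame


def battery_life_hours(capacity_mah, cell_voltage_v, avg_power_mw):
    """Ideal runtime: stored energy (mWh) divided by average power (mW)."""
    energy_mwh = capacity_mah * cell_voltage_v
    return energy_mwh / avg_power_mw


def inference_power_mw(energy_uj, period_ms):
    """Average power spent on inference; uJ per ms equals mW."""
    return energy_uj / period_ms


life_h = battery_life_hours(CAPACITY_MAH, NOMINAL_CELL_V, AVG_POWER_MW)
nn_power_mw = inference_power_mw(ENERGY_PER_INFERENCE_UJ, INFERENCE_PERIOD_MS)
nn_share = nn_power_mw / AVG_POWER_MW

print(f"estimated battery life: {life_h:.1f} h")
print(f"inference power: {nn_power_mw:.2f} mW ({nn_share:.0%} of average)")
```

Under these assumptions the estimate comes out at roughly 45 hours, in the same range as the reported 43 hours; the gap is plausibly explained by conversion losses or a lower effective cell voltage. If one inference really does run every 12.8 ms, inference would account for about 1.1 mW, or roughly 40% of the average power, with the remainder going to sensing, radio, and the rest of the system.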

In-Ear-Voice: Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms
