Abstract: Feature extraction is an essential part of automatic speech recognition (ASR): it compresses raw speech data and enhances features. Conventional digital-domain implementations, however, have run into energy-consumption and processing-speed bottlenecks. We therefore propose a Mixed-Signal Processing (MSP) architecture to efficiently extract Mel-Frequency Cepstrum Coefficients (MFCC) features. MSP-MFCC pre-processes speech signals in the analog domain, which significantly reduces the cost of the analog-to-digital converter (ADC) as well as the computational complexity of the digital backend. Moreover, by improving the processing flow, MSP-MFCC eliminates the time-consuming Fourier transform of the conventional digital realization. We fabricated the analog part in 180 nm CMOS mixed-signal technology and measured the chip. The measured results show that the energy consumption of MSP-MFCC is 0.72 μJ/frame and the processing speed is up to 45.79 μs/frame. MSP-MFCC achieves a 95% energy saving and about a 6.4× speedup over the state of the art. Further, using the features extracted by MSP-MFCC, speech recognition simulation reaches an accuracy of 98.2%, maintaining leading performance among current counterparts. The proposed MFCC extractor is competitive for integration in ultra-low-power, always-on wearable speech recognition applications.
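
For context, the conventional digital MFCC flow that the abstract contrasts against (window, Fourier transform, mel filterbank, logarithm, DCT) can be sketched as below. This is a generic textbook outline, not the MSP-MFCC design: the function names, the naive DFT, the Hamming window, and the default filter and coefficient counts are all illustrative assumptions that do not come from the paper.

```python
import cmath
import math


def hz_to_mel(f_hz):
    # Common mel-scale mapping (one of several conventions in use).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hamming-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(mel_max * i / (num_filters + 1))
                for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges_hz]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc_frame(frame, sample_rate, num_filters=20, num_coeffs=13):
    """MFCCs of one frame: spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Small offset avoids log(0) for empty bands.
    log_energies = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                    for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energies))
            for c in range(num_coeffs)]
```

Calling `mfcc_frame(samples, 8000)` on a 256-sample frame returns 13 coefficients under these assumed defaults. The `power_spectrum` step is the Fourier-transform stage that, according to the abstract, MSP-MFCC avoids by restructuring the processing flow.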

Mahalanobis Distance Calculation Module for Speech Recognition

An embedded speech recognition system solution based on acceleration modules

Design of a GMM Vector Multiplier Based on Two-dimensional Systolic Array

Real-time Speaker Recognition System for PDA

Moderate Vocabulary English Speech Recognition System Embedded on a Chip

Design of Speech Recognition Co-Processor for the Embedded Implementation

Real-Time Speech Recognition Method for Embedded System

A Master-Slave SoC Structure for HMM Based Speech Recognition

An Efficient Computation Algorithm in Mandarin Continuous Speech Recognition

Design of a configurable output probability calculation coprocessor for speech recognition

English Speech Recognition System on Chip

An Ultra-Low Power Binarized Convolutional Neural Network-Based Speech Recognition Processor with On-Chip Self-Learning

High Performance Mandarin Digit Recognition System on a DSP Chip

Multi-Pass Decoding Algorithm Based on a Speech Recognition Chip

MSP-MFCC: Energy-Efficient MFCC Feature Extraction Method With Mixed-Signal Processing Architecture for Wearable Speech Recognition Applications

Efficiency improvements for a speech recognition coprocessor

A Novel Efficient Decoding Algorithm for CDHMM-based Speech Recognizer on Chip

High Performance Digit Mandarin Speech Recognition

A System for Mandarin Short Phrase Recognition on Portable Devices

Efficient Embedded Speech Recognition for Very Large Vocabulary Mandarin Car-Navigation Systems

A Fully Integrated 1.7mW Attention-Based Automatic Speech Recognition Processor