Abstract:

Objective: Develop a novel and highly efficient framework that decodes Inferior Colliculus (IC) neural activities for phoneme recognition.

Methods: We propose using Hyperdimensional Computing (HDC) to support an efficient phoneme recognition algorithm, in contrast to widely applied Deep Neural Networks (DNNs). The high-dimensional representation and operations in HDC are rooted in human brain functionalities and are naturally parallelizable, showing the potential for efficient neural activity analysis. Our proposed method includes a spatial- and temporal-aware HDC encoder that effectively captures global and local patterns. As part of our framework, we deploy the lightweight HDC-based algorithm on a highly customizable and flexible hardware platform, i.e., Field Programmable Gate Arrays (FPGA), for optimal algorithm speedup. To evaluate our method, we record IC neural activities from gerbils while playing the sounds of different phonemes.

Results: We compare our proposed method with multiple baseline machine learning algorithms in recognition quality and learning efficiency across different hardware platforms. The results show that our method generally achieves better classification quality than the best-performing baseline. Compared to the Deep Residual Neural Network (i.e., ResNet), our method shows a speedup of up to 74×, 67×, and 210× on CPU, GPU, and FPGA, respectively. We achieve up to 15% (10%) higher accuracy in consonant (vowel) classification than ResNet.

Conclusion: By leveraging brain-inspired HDC for IC neural activity encoding and phoneme classification, we achieve orders-of-magnitude runtime speedup while improving accuracy in various challenging task settings.

Significance: Decoding IC neural activities is an important step toward a better understanding of the human auditory system. However, these responses from the central auditory system are noisy and exhibit high variance, demanding large-scale datasets and iterative model fine-tuning. The proposed HDC-based framework is more scalable and viable for future real-world deployment thanks to its fast training and overall better quality.
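To make the HDC operations named in the Methods more concrete, the following is a minimal sketch of a generic hyperdimensional classifier for multichannel time series. It is not the authors' encoder: the dimensionality `D`, the quantization into levels, the use of a cyclic shift to mark time steps, the toy patterns, and every name in the code (`HDCClassifier`, `bind`, `bundle`, `permute`, `noisy`) are assumptions chosen for illustration. Channel identity stands in for the spatial dimension and the shift index for the temporal one.

```python
import random

D = 2048  # hypervector dimensionality (illustrative assumption)


def random_hv(rng):
    """Draw a random bipolar hypervector with entries in {-1, +1}."""
    return [rng.choice((-1, 1)) for _ in range(D)]


def bind(a, b):
    """Binding: elementwise product associates two hypervectors."""
    return [x * y for x, y in zip(a, b)]


def bundle(vectors):
    """Bundling: elementwise sum superimposes several hypervectors."""
    return [sum(col) for col in zip(*vectors)]


def permute(v, shift):
    """Permutation: cyclic shift, used here to mark the time step."""
    shift %= len(v)
    return v[-shift:] + v[:-shift] if shift else list(v)


def cosine(a, b):
    """Cosine similarity between two hypervectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0


class HDCClassifier:
    """Hypothetical prototype-based HDC classifier (not the paper's model)."""

    def __init__(self, n_channels, n_levels, seed=0):
        rng = random.Random(seed)
        self.channel_hvs = [random_hv(rng) for _ in range(n_channels)]
        self.level_hvs = [random_hv(rng) for _ in range(n_levels)]
        self.n_levels = n_levels
        self.prototypes = {}

    def encode(self, sample):
        """Encode sample[t][c], a value in [0, 1) at time t on channel c."""
        steps = []
        for t, frame in enumerate(sample):
            bound = []
            for c, value in enumerate(frame):
                level = min(int(value * self.n_levels), self.n_levels - 1)
                # Spatial part: bind the channel identity to its quantized value.
                bound.append(bind(self.channel_hvs[c], self.level_hvs[level]))
            # Temporal part: shift the frame hypervector by its time index.
            steps.append(permute(bundle(bound), t))
        return bundle(steps)

    def fit(self, samples, labels):
        """Single-pass training: bundle the encodings of each class."""
        grouped = {}
        for sample, label in zip(samples, labels):
            grouped.setdefault(label, []).append(self.encode(sample))
        self.prototypes = {y: bundle(hvs) for y, hvs in grouped.items()}

    def predict(self, sample):
        """Return the label whose prototype is most similar to the query."""
        query = self.encode(sample)
        return max(self.prototypes, key=lambda y: cosine(query, self.prototypes[y]))


# Toy data: two synthetic 3-channel, 4-step patterns (not real IC recordings).
rng = random.Random(1)
PATTERN_A = [[0.9, 0.1, 0.9], [0.1, 0.9, 0.1]] * 2
PATTERN_B = [[0.1, 0.9, 0.1], [0.9, 0.1, 0.9]] * 2


def noisy(pattern, noise=0.05):
    """Add small uniform noise to every value of a pattern."""
    return [[v + rng.uniform(-noise, noise) for v in frame] for frame in pattern]


train = [noisy(PATTERN_A) for _ in range(5)] + [noisy(PATTERN_B) for _ in range(5)]
labels = ["a"] * 5 + ["b"] * 5
clf = HDCClassifier(n_channels=3, n_levels=4)
clf.fit(train, labels)
print(clf.predict(noisy(PATTERN_A)), clf.predict(noisy(PATTERN_B)))
```

Training in this sketch is a single pass of additions per class, with no gradient updates, and every operation is elementwise. That is the general property of HDC that makes it a natural fit for parallel hardware; the speedups reported in the abstract refer to the authors' own implementation, not to this sketch.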

Parallel And Hierarchical Decision Making For Sparse Coding In Speech Recognition

Deep and Sparse Learning in Speech and Language Processing: An Overview

Speech Enhancement with a GSC-like Structure Employing Sparse Coding

Online Pattern Learning for Non-Negative Convolutive Sparse Coding

Efficient Sparse Coding with the Adaptive Locally Competitive Algorithm for Speech Classification

Distributed Submodular Maximization for Large Vocabulary Continuous Speech Recognition

K-CPD: Learning of Overcomplete Dictionaries for Tensor Sparse Coding

Learning Word Representations with Hierarchical Sparse Coding

State-Clustering Based Multiple Deep Neural Networks Modeling Approach for Speech Recognition

Hyperdimensional Brain-Inspired Learning for Phoneme Recognition With Large-Scale Inferior Colliculus Neural Activities

Construction of a compact dynamic decoder network for large vocabulary continuous speech recognition

Blockwise Coordinate Descent Schemes for Efficient and Effective Dictionary Learning

A hybrid discriminant fuzzy DNN with enhanced modularity bat algorithm for speech recognition

Unsupervised Speaker Adaptation Of Deep Neural Network Based On The Combination Of Speaker Codes And Singular Value Decomposition For Speech Recognition

Heterogeneous Convolutive Non-Negative Sparse Coding

Speech overlap detection and attribution using convolutive non-negative sparse coding

A Cluster-Based Multiple Deep Neural Networks Method for Large Vocabulary Continuous Speech Recognition

Building DNN acoustic models for large vocabulary speech recognition

Hidden Markov Acoustic Modeling with Bootstrap and Restructuring for Low-Resourced Languages

Study on Hierarchical Speech Recognition

Joint sparse representation based cepstral-domain dereverberation for distant-talking speech recognition