Abstract: This paper introduces HAAQI-Net, a non-intrusive deep learning model for music audio quality assessment tailored to hearing aid users. Traditional methods such as the Hearing Aid Audio Quality Index (HAAQI) are intrusive: they require comparison against a reference signal. HAAQI-Net offers a more accessible and efficient alternative. It uses a bidirectional Long Short-Term Memory (BLSTM) architecture with attention mechanisms and features from the pre-trained BEATs model to predict HAAQI scores directly from music audio clips and hearing loss patterns. The predicted scores achieve a Linear Correlation Coefficient (LCC) of 0.9368, a Spearman's Rank Correlation Coefficient (SRCC) of 0.9486, and a Mean Squared Error (MSE) of 0.0064, and inference time drops from 62.52 seconds to 2.54 seconds. Although effective, feature extraction with the large BEATs model incurs computational overhead. To address this, a knowledge distillation strategy trains a student distillBEATs model that distills information from the teacher BEATs model during HAAQI-Net training, reducing the number of required parameters. The distilled HAAQI-Net maintains strong performance, with an LCC of 0.9071, an SRCC of 0.9307, and an MSE of 0.0091, while reducing parameters by 75.85% and inference time by 96.46%. This reduction improves HAAQI-Net's efficiency and scalability, making it viable for real-world music audio quality assessment in hearing aid settings. The work also opens avenues for further research into optimizing deep learning models for specific applications, and it contributes insights into developing efficient, accurate models for practical use in hearing aid technology.
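For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how LCC (Pearson correlation), SRCC (Pearson correlation of ranks), and MSE can be computed between reference HAAQI scores and model predictions. It is not the authors' evaluation code. The function names and the five score pairs are hypothetical and chosen only for illustration; the two inference times passed to `relative_reduction` are the ones quoted in the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Linear Correlation Coefficient (LCC) between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's Rank Correlation Coefficient (SRCC)."""
    return pearson(average_ranks(x), average_ranks(y))


def mse(x, y):
    """Mean Squared Error between two equal-length lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def relative_reduction(before, after):
    """Fraction by which a quantity shrinks when going from before to after."""
    return (before - after) / before


# Hypothetical scores, for illustration only (not data from the paper).
reference = [0.10, 0.35, 0.50, 0.72, 0.90]
predicted = [0.15, 0.30, 0.55, 0.70, 0.85]

print(f"LCC  = {pearson(reference, predicted):.4f}")
print(f"SRCC = {spearman(reference, predicted):.4f}")
print(f"MSE  = {mse(reference, predicted):.5f}")

# Inference times quoted in the abstract: 62.52 s and 2.54 s.
print(f"Time reduction = {relative_reduction(62.52, 2.54):.2%}")
```

LCC measures how closely predictions follow the reference scores on a linear scale, SRCC measures only whether the ordering is preserved, and MSE measures the absolute error, which is why the three are usually reported together.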

HILCodec: High-Fidelity and Lightweight Neural Audio Codec

A High Fidelity and Low Complexity Neural Audio Coding

AudioDec: An Open-source Streaming High-fidelity Neural Audio Codec

Psychoacoustic Calibration of Loss Functions for Efficient End-to-End Neural Audio Coding

SoundStream: An End-to-End Neural Audio Codec

MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios

APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm

APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding

A Low-Bitrate Neural Audio Codec Framework with Bandwidth Reduction and Recovery for High-Sampling-Rate Waveforms

Code Drift: Towards Idempotent Neural Audio Codecs

FunCodec: A Fundamental, Reproducible and Integrable Open-source Toolkit for Neural Speech Codec

Source-Aware Neural Speech Coding for Noisy Speech Compression

SNAC: Multi-Scale Neural Audio Codec

Neural Audio Coding with Deep Complex Networks

Enhanced Standard Compatible Image Compression Framework based on Auxiliary Codec Networks

PSCodec: A Series of High-Fidelity Low-bitrate Neural Speech Codecs Leveraging Prompt Encoders

SuperCodec: A Neural Speech Codec with Selective Back-Projection Network

HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids

Optimizing Neural Speech Codec for Low-Bitrate Compression via Multi-Scale Encoding

Learning Source Disentanglement in Neural Audio Codec

LMCodec: A Low Bitrate Speech Codec With Causal Transformer Models