Abstract: Objectives: Visual and contextual cues facilitate speech recognition in suboptimal listening conditions (e.g., background noise, hearing loss, hearing aid signal processing). Moreover, successful speech recognition in challenging listening conditions is linked to cognitive abilities such as working memory and fluid intelligence. However, it is unclear which cognitive abilities facilitate the use of visual and contextual cues in individuals with normal hearing and in hearing aid users. The first aim was to investigate whether hearing aid users rely on visual and contextual cues to a higher degree than individuals with normal hearing in a speech-in-noise recognition task. The second aim was to investigate whether working memory and fluid intelligence are associated with the use of visual and contextual cues in these groups. Design: Groups of participants with normal hearing and hearing aid users with bilateral, symmetrical mild to severe sensorineural hearing loss were included (n = 169 per group). The Samuelsson and Rönnberg task was administered to measure speech recognition in speech-shaped noise. The task consists of an equal number of sentences administered in the auditory and audiovisual modalities, as well as without and with contextual cues (a visually presented word preceding the sentence, e.g., "Restaurant"). The signal-to-noise ratio was individually set to 1 dB below the level obtained for 50% correct speech recognition in the hearing-in-noise test administered in the auditory modality. The Reading Span test was used to measure working memory capacity and the Raven test was used to measure fluid intelligence. The data were analyzed using linear mixed-effects modeling. Results: Both groups exhibited significantly higher speech recognition performance when visual and contextual cues were available. Although the hearing aid users performed significantly worse than those with normal hearing in the auditory modality, both groups reached similar performance levels in the audiovisual modality. In addition, a significant positive relationship between the Raven test score and speech recognition performance was found only for the hearing aid users in the audiovisual modality. There was no significant relationship between Reading Span test score and performance. Conclusions: Both participants with normal hearing and hearing aid users benefitted from contextual cues, regardless of cognitive abilities. The hearing aid users relied on visual cues to compensate for their perceptual difficulties, reaching a performance level similar to that of the participants with normal hearing when visual cues were available, despite worse performance in the auditory modality. Notably, the hearing aid users with higher fluid intelligence were able to capitalize on visual cues more successfully than those with lower fluid intelligence, resulting in better speech-in-noise recognition performance.

EXPRESS: Prior multisensory learning can facilitate auditory-only voice-identity and speech recognition in noise

Adaptive Temporal Encoding Leads to a Background-Insensitive Cortical Representation of Speech

Implicit multisensory associations influence voice recognition

Multisensory benefits for speech recognition in noisy environments

Visual Facial Enhancements Can Significantly Improve Speech Perception in the Presence of Noise

Prediction and constraint in audiovisual speech perception

Multimodal Input Aids a Bayesian Model of Phonetic Learning

Enhancing speech perception in noise through articulation

The effects of temporal cues, point-light displays, and faces on speech identification and listening effort

Voice-Associated Static Face Image Releases Speech from Informational Masking

Spatial alignment between faces and voices improves selective attention to audio-visual speech

Functional Connectivity between Face-Movement and Speech-Intelligibility Areas during Auditory-Only Speech Perception

Exploring the role of singing, semantics, and amusia screening in speech-in-noise perception in musicians and non-musicians

Neural Mechanisms Underlying Cross-Modal Phonetic Encoding

Familiarity Is Key: Exploring the Effect of Familiarity on the Face-Voice Correlation

Relationships Between Hearing Status, Cognitive Abilities, and Reliance on Visual and Contextual Cues

Speech-derived haptic stimulation enhances speech recognition in a multi-talker background

Decreasing hearing ability does not lead to improved visual speech extraction as revealed in a neural speech tracking paradigm

Reassessing the Benefits of Audiovisual Integration to Speech Perception and Intelligibility

Rapid and long-lasting improvements in neural discrimination of acoustic signals with passive familiarization

The Effect on Speech-in-Noise Perception of Real Faces and Synthetic Faces Generated with either Deep Neural Networks or the Facial Action Coding System