Contextual Spelling Correction with Large Language Models

Nikhil Siddhartha, Xavier Velez, Weiran Wang, Angad Chandorkar, Zelin Wu, Kandarp Joshi, D. Caseiro, G. Pundak, Gan Song, Ben Haynor, Pat Rondon, K. Sim
DOI: https://doi.org/10.1109/ASRU57964.2023.10389637
2023-12-16
Abstract: Contextual Spelling Correction (CSC) models are used to improve automatic speech recognition (ASR) quality given user-specific context. Typically, context is modeled as a large set of text spans that are compared against a given ASR hypothesis using some distance measure (text, phonetic, or neural embedding). In this work we propose a CSC system based on a single Large Language Model (LLM) adapted with prompt tuning. Our approach is shown to be data efficient and does not require dedicated serving. Our system exhibits advanced contextualization capabilities, such as support for phonetic spellings, cross-lingual scripts, and context specified as topics, with little to no data engineering. On voice assistant datasets, our system achieves a 7.8% absolute word error rate reduction over a reference ASR system when given relevant context, and improves upon other contextualization solutions. Finally, we test our system in a prompt-injection attack scenario and report vulnerabilities and mitigations.
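
For orientation, the conventional distance-based setup that the abstract contrasts against (context as a set of text spans compared to the ASR hypothesis) can be sketched roughly as follows. This is an illustrative baseline only, not the paper's LLM-based system: it uses plain character-level edit distance as the "text" distance, and the function names, the `max_norm_dist` threshold, and the example contact name are all assumptions made for the sketch.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def correct(hypothesis: str, context: list[str], max_norm_dist: float = 0.34) -> str:
    """Replace word spans of the hypothesis with the closest context phrase.

    A span is replaced only if its length-normalized edit distance to a
    context phrase is at most max_norm_dist (a hypothetical threshold).
    """
    words = hypothesis.split()
    out = []
    i = 0
    while i < len(words):
        best = None  # (normalized distance, span length in words, phrase)
        for phrase in context:
            n = len(phrase.split())
            if i + n > len(words):
                continue  # phrase is longer than the remaining hypothesis
            span = " ".join(words[i:i + n])
            dist = edit_distance(span.lower(), phrase.lower())
            norm = dist / max(len(span), len(phrase))
            if norm <= max_norm_dist and (best is None or norm < best[0]):
                best = (norm, n, phrase)
        if best is None:
            out.append(words[i])
            i += 1
        else:
            out.append(best[2])
            i += best[1]
    return " ".join(out)
```

With a hypothetical contact list `["Kaitlyn Nguyen"]`, calling `correct("call caitlin nguyen now", ["Kaitlyn Nguyen"])` rewrites the misrecognized name while leaving the rest of the hypothesis untouched. A sketch like this has no notion of phonetic spellings, cross-lingual scripts, or topic-level context, which are the capabilities the abstract attributes to the LLM-based approach.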
Computer Science