Abstract: Recently, advancements in large language models (LLMs) have shown an unprecedented ability across various language tasks. This paper investigates the potential application of LLMs to slot filling with noisy ASR transcriptions, via both in-context learning and task-specific fine-tuning. Dedicated prompt designs and fine-tuning approaches are proposed to improve the robustness of LLMs for slot filling with noisy ASR transcriptions. Moreover, a linearised knowledge injection (LKI) scheme is also proposed to integrate dynamic external knowledge into LLMs. Experiments were performed on SLURP to quantify the performance of LLMs, including GPT-3.5-turbo, GPT-4, LLaMA-13B and Vicuna-13B (v1.1 and v1.5) with different ASR error rates. The use of the proposed fine-tuning together with the LKI scheme for LLaMA-13B achieved an 8.3% absolute SLU-F1 improvement compared to the strong Flan-T5-base baseline system on a limited data setup.

What problem does this paper attempt to address?

This paper aims to improve the performance of large language models (LLMs) on slot filling when the input is a noisy automatic speech recognition (ASR) transcription. Specifically, the research focuses on the following points:

1. **Evaluating LLM performance under different ASR error rates**: Using different ASR models (such as different versions of the Whisper model), the researchers evaluated the slot-filling performance of LLMs on transcriptions with different error rates.
2. **Proposing improved prompt designs and fine-tuning methods**: To make LLMs more robust to noisy ASR transcriptions, the paper proposes dedicated prompt designs and data-efficient fine-tuning methods. These methods aim to use external dynamic knowledge to guide the generation process of LLMs and to reduce inaccurate information extraction caused by ASR errors.
3. **Introducing the linearised knowledge injection (LKI) scheme**: The LKI scheme linearises the contextual knowledge extracted from the N-best list into text and provides it to the LLM as part of the prompt. This supplies constraints that guide language generation, especially when the transcription is noisy.
4. **Exploring slot filling with limited data**: The paper also explores how transfer learning and pre-trained language models can improve slot filling when labelled data is limited, and in particular how to make effective use of a small amount of data when fine-tuning in a specific domain.
In summary, the main objective of this paper is to enhance the slot-filling ability of LLMs on noisy ASR transcriptions through improved prompt design, the LKI scheme, and Low-Rank Adaptation (LoRA) fine-tuning, so as to achieve more accurate and efficient natural language understanding in practical applications.
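
The idea behind point 3 can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`extract_candidates`, `linearise`, `build_prompt`), the substring matching against a per-slot value list, and the prompt wording are all assumptions made for illustration. The sketch only shows the general pattern of turning knowledge found in an N-best list into a line of text that is placed in the prompt.

```python
from typing import Dict, List


def extract_candidates(nbest: List[str],
                       slot_values: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep, per slot type, the known values found in at least one hypothesis.

    Assumption: knowledge is a hypothetical dict of slot type -> known values,
    matched by case-insensitive substring search.
    """
    found: Dict[str, List[str]] = {}
    for slot, values in slot_values.items():
        hits = [v for v in values
                if any(v.lower() in hyp.lower() for hyp in nbest)]
        if hits:
            found[slot] = hits
    return found


def linearise(candidates: Dict[str, List[str]]) -> str:
    """Flatten the structured candidates into a single line of text."""
    return "; ".join(f"{slot}: {', '.join(vals)}"
                     for slot, vals in candidates.items())


def build_prompt(nbest: List[str], slot_values: Dict[str, List[str]]) -> str:
    """Assemble an illustrative prompt: instruction, 1-best text, knowledge."""
    knowledge = linearise(extract_candidates(nbest, slot_values))
    lines = [
        "Fill the slots for the utterance below. "
        "The transcription may contain recognition errors.",
        f"Utterance: {nbest[0]}",
    ]
    if knowledge:
        lines.append(f"Possible slot values: {knowledge}")
    lines.append("Slots (JSON):")
    return "\n".join(lines)


# Hypothetical example data: the 1-best hypothesis contains an ASR error.
nbest = ["play hay jude by the beatles", "play hey jude by the beatles"]
slot_values = {
    "song_name": ["hey jude", "yesterday"],
    "artist_name": ["the beatles"],
}
prompt = build_prompt(nbest, slot_values)
```

In this example the 1-best hypothesis misrecognises the song title, but the second hypothesis contains the correct value, so the linearised knowledge line gives the model a hint it can use to recover the slot.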

Speech-based Slot Filling using Large Language Models
