RSVP: Customer Intent Detection via Agent Response Contrastive and Generative Pre-Training

Yu-Chien Tang, Wei-Yao Wang, An-Zi Yen, Wen-Chih Peng
2023-10-15
Abstract: The dialogue systems in customer services have been developed with neural models to provide users with precise answers and round-the-clock support in task-oriented conversations by detecting customer intents based on their utterances. Existing intent detection approaches have highly relied on adaptively pre-training language models with large-scale datasets, yet the predominant cost of data collection may hinder their superiority. In addition, they neglect the information within the conversational responses of the agents, which have a lower collection cost, but are significant to customer intent as agents must tailor their replies based on the customers' intent. In this paper, we propose RSVP, a self-supervised framework dedicated to task-oriented dialogues, which utilizes agent responses for pre-training in a two-stage manner. Specifically, we introduce two pre-training tasks to incorporate the relations of utterance-response pairs: 1) Response Retrieval by selecting a correct response from a batch of candidates, and 2) Response Generation by mimicking agents to generate the response to a given utterance. Our benchmark results for two real-world customer service datasets show that RSVP significantly outperforms the state-of-the-art baselines by 4.95% for accuracy, 3.4% for MRR@3, and 2.75% for MRR@5 on average. Extensive case studies are investigated to show the validity of incorporating agent responses into the pre-training stage.
Computation and Language
What problem does this paper attempt to address?
The paper addresses customer intent detection in task-oriented dialogue systems, particularly in customer service scenarios. It proposes RSVP, a self-supervised framework that improves the recognition of customer intent by pre-training on the responses of customer service agents.

Traditional intent detection methods typically rely on large-scale, high-quality annotated data for pre-training, which is costly and time-consuming to collect. Most studies also focus solely on the customer's utterance and neglect the agent's response. The paper points out that the agent's response contains important clues about the customer's intent, because agents must tailor their replies to that intent, and that response data is easier to obtain and requires no additional annotation.

The RSVP framework therefore uses agent responses in two stages:

1. **Pre-training Stage**:
   - **Response Retrieval**: Select the correct agent response from a batch of candidate responses, which trains the model to distinguish correct from incorrect responses.
   - **Response Generation**: Mimic the agent by generating the response to a given customer utterance, so the model learns directly how an agent would reply.
2. **Fine-tuning Stage**: Apply the pre-trained model to the specific intent detection task to further optimize performance.

In this way, RSVP reduces the dependency on large external annotated datasets and makes use of data already present in the customer service dialogue system (i.e., the agent's responses), thereby improving the accuracy of customer intent detection. Experimental results on two real-world customer service datasets show that RSVP outperforms the state-of-the-art baselines by 4.95% in accuracy, 3.4% in MRR@3, and 2.75% in MRR@5 on average.
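To make the Response Retrieval idea concrete, the following is a minimal sketch of selecting the correct response from a batch of candidates. It is not the authors' implementation: the cosine similarity, the `temperature` value, the cross-entropy form of the loss, and the function names `retrieval_loss` and `retrieve` are all assumptions chosen for illustration, and the toy 2-dimensional vectors stand in for encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieval_loss(utt_embs, resp_embs, temperature=0.1):
    """In-batch retrieval loss (assumed form, not taken from the paper).

    utt_embs[i] and resp_embs[i] are embeddings of a matching
    utterance-response pair. Every other response in the batch acts as an
    incorrect candidate. The loss is the mean cross-entropy of picking
    response i for utterance i.
    """
    losses = []
    for i, utt in enumerate(utt_embs):
        logits = [cosine(utt, resp) / temperature for resp in resp_embs]
        # Numerically stable log-sum-exp over all candidates in the batch.
        top = max(logits)
        log_z = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


def retrieve(utt_emb, resp_embs):
    """Return the index of the candidate response most similar to the utterance."""
    scores = [cosine(utt_emb, resp) for resp in resp_embs]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy embeddings (hypothetical values): pair 0 and pair 1 point in similar directions.
utterances = [[1.0, 0.0], [0.0, 1.0]]
responses = [[0.9, 0.1], [0.1, 0.9]]

aligned = retrieval_loss(utterances, responses)
mismatched = retrieval_loss(utterances, list(reversed(responses)))
print(aligned < mismatched)  # the loss is lower when pairs line up
```

In a real setup the embeddings would come from a trainable encoder, and minimizing this loss would pull each utterance toward its own agent response and away from the other responses in the batch.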