Abstract: In an information-seeking conversation, a user may ask questions that are under-specified or unanswerable. An ideal agent would interact by initiating different response types according to the available knowledge sources. However, most current studies either fail to or artificially incorporate such agent-side initiative. This work presents InSCIt, a dataset for Information-Seeking Conversations with mixed-initiative Interactions. It contains 4.7K user-agent turns from 805 human-human conversations where the agent searches over Wikipedia and either directly answers, asks for clarification, or provides relevant information to address user queries. The data supports two subtasks, evidence passage identification and response generation, as well as a human evaluation protocol to assess model performance. We report results of two systems based on state-of-the-art models of conversational knowledge identification and open-domain question answering. Both systems significantly underperform humans, suggesting ample room for improvement in future studies.

What problem does this paper attempt to address?

The paper addresses a limitation of existing dialogue systems in information-seeking conversations. Most current studies either fail to incorporate agent-side initiative or incorporate it only artificially. As a result, existing systems typically just respond to the user's question and cannot choose among response strategies according to the available knowledge sources, such as answering directly, asking for clarification, or providing relevant but incomplete information. This limits their flexibility and practicality, especially when a user's question is under-specified or cannot be answered directly.

To address this, the paper introduces INSCIT, a dataset of information-seeking conversations with mixed-initiative interactions. INSCIT aims to support the development of dialogue systems that better understand user intent and adopt multiple response strategies. These strategies include, but are not limited to:

- **Direct answer**: When there is enough information to fully answer the user's question.
- **Request clarification**: When the user's question is unclear or ambiguous, the agent asks a follow-up question to obtain more details.
- **Provide relevant but incomplete information**: When there is not enough information to answer the question directly, but relevant information can partially meet the user's needs.

With such a dataset, researchers can train and evaluate dialogue systems to handle complex and varied user queries more flexibly. INSCIT also supports two subtasks, evidence passage identification and response generation, along with a human evaluation protocol for assessing model performance. This helps advance dialogue system technology and bring it closer to how humans communicate.
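The three response strategies described above can be sketched as a small decision rule. This is only an illustration under stated assumptions, not the paper's method or the dataset's schema: the `Evidence` fields, the `ResponseType` names, and the `choose_response_type` function are all hypothetical, and a real system would derive these signals from retrieved Wikipedia passages rather than from hand-set counts.

```python
from dataclasses import dataclass
from enum import Enum


class ResponseType(Enum):
    """Hypothetical labels for the three strategies named in the summary."""
    DIRECT_ANSWER = "direct_answer"
    CLARIFICATION = "clarification"
    RELEVANT_INFO = "relevant_info"


@dataclass
class Evidence:
    """Assumed summary of what retrieval found for one user query."""
    answering_passages: int  # passages that fully answer the query
    interpretations: int     # distinct readings of the query supported by evidence


def choose_response_type(ev: Evidence) -> ResponseType:
    # Ambiguous query: several readings are supported, so ask which one is meant.
    if ev.interpretations > 1:
        return ResponseType.CLARIFICATION
    # A single reading with at least one passage that answers it.
    if ev.answering_passages > 0:
        return ResponseType.DIRECT_ANSWER
    # No full answer: fall back to relevant but incomplete information.
    return ResponseType.RELEVANT_INFO
```

For example, `choose_response_type(Evidence(answering_passages=3, interpretations=2))` selects a clarification question, because the ambiguity check runs before the direct-answer check. That ordering is a design choice of this sketch, not something the source specifies.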

INSCIT: Information-Seeking Conversations with Mixed-Initiative Interactions

Human-in-the-loop

Information Seeking in the Spirit of Learning: a Dataset for Conversational Curiosity

An In-depth Investigation of User Response Simulation for Conversational Search.

A Large-Scale Analysis of Mixed Initiative in Information-Seeking Dialogues for Conversational Search

Why and When: Understanding System Initiative during Conversational Collaborative Search

Conversational Search with Mixed-Initiative - Asking Good Clarification Questions backed-up by Passage Retrieval

Benchmarking Large Language Models for Conversational Question Answering in Multi-instructional Documents

Designing the Conversational Agent: Asking Follow-up Questions for Information Elicitation

Leveraging User Simulation to Develop and Evaluate Conversational Information Access Agents

The Efficiency of Question-Asking Strategies in a Real-World Visual Search Task

Towards Better Understanding of User Satisfaction in Open-Domain Conversational Search

IDAT: A Multi-Modal Dataset and Toolkit for Building and Evaluating Interactive Task-Solving Agents

ChatShop: Interactive Information Seeking with Language Agents

TopiOCQA: Open-domain Conversational Question Answering with Topic Switching

Beyond Query: Interactive User Intention Understanding

ConvSearch: A Open-Domain Conversational Search Behavior Dataset

Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents

AgentSense: Benchmarking Social Intelligence of Language Agents through Interactive Scenarios

Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment

Augmenting Ad-Hoc IR Dataset for Interactive Conversational Search