Take It, Leave It, or Fix It: Measuring Productivity and Trust in Human-AI Collaboration

Crystal Qian, James Wexler
DOI: https://doi.org/10.1145/3640543.3645198
2024-04-02
Abstract: Although recent developments in generative AI have greatly enhanced the capabilities of conversational agents such as Google's Gemini (formerly Bard) or OpenAI's ChatGPT, it's unclear whether the usage of these agents aids users across various contexts. To better understand how access to conversational AI affects productivity and trust, we conducted a mixed-methods, task-based user study, observing 76 software engineers (N=76) as they completed a programming exam with and without access to Bard. Effects on performance, efficiency, satisfaction, and trust vary depending on user expertise, question type (open-ended "solve" vs. definitive "search" questions), and measurement type (demonstrated vs. self-reported). Our findings include evidence of automation complacency, increased reliance on the AI over the course of the task, and increased performance for novices on "solve"-type questions when using the AI. We discuss common behaviors, design recommendations, and impact considerations to improve collaborations with conversational AI.
Human-Computer Interaction
What problem does this paper attempt to address?
This paper examines how access to conversational AI (such as Google's Bard) affects productivity and trust when software engineers work on programming tasks. Through a mixed-methods, task-based user study, the researchers observe how the performance, efficiency, satisfaction, and trust of 76 software engineers change when they complete a programming exam with and without access to Bard. The research centers on two research questions:

1. **RQ1: Impact on Productivity** - How does using conversational AI affect productivity?
2. **RQ2: Trust-related Behaviors** - How do users demonstrate their trust in conversational AI?

Through these questions, the researchers aim to understand how behavior differs across user groups (classified by experience level) when using conversational AI, and how those differences affect productivity and trust in AI systems. The study also explores side effects of automation, such as increased reliance on the AI and automation complacency, and proposes design recommendations to improve human-AI collaboration with conversational AI.