Toward a design theory for virtual companionship
Timo Strohmann, Dominik Siemon, Bijan Khosrawi-Rad, Susanne Robra-Bissantz
DOI: https://doi.org/10.1080/07370024.2022.2084620
2022-07-20
Abstract: Due to significant technological advances in the field of artificial intelligence (AI), which are driven by increased computing power, the ubiquitous availability of data, and new algorithms, new forms of intelligent systems and services have been developed and brought to the market (Choudhury et al., 2020; Clark et al., 2019a; Diederich et al., 2022; Kaplan & Haenlein, 2019; Ransbotham et al., 2018; Robert et al., 2020; Rzepka & Berger, 2018). In addition to specific applications in the form of virtual assistants, such as Apple's Siri or Amazon's Alexa, companies increasingly develop chatbots and enterprise bots to interact with customers (Diederich et al., 2022; Maedche et al., 2016; McTear et al., 2016). What all these systems have in common is that they allow their users to interact with them using natural language, which is why the systems are summarized by the term conversational agent (CA) (Diederich et al., 2022; McTear et al., 2016). There are already various use cases for CAs today, ranging from executing smartphone functions, such as creating calendar entries or sending messages, to smart home control and interaction in the healthcare context (Ahmad et al., 2022; Elshan et al., 2022; Gnewuch et al., 2017; McTear et al., 2016; Sin & Munteanu, 2020). Thus, CAs currently offer a new way of interacting with information technology (Morana et al., 2017). Recent literature reviews show a growing interest in CAs and AI-enabled systems (Diederich et al., 2022; Elshan et al., 2022; Nißen et al., 2021; Rzepka & Berger, 2018), but they also reveal a limited variety of application contexts, which mostly focus on short-term interactions in marketing, sales, and support. Application scenarios that require long-term interaction exist but are under-researched (Diederich et al., 2022; Elshan et al., 2022).
Additionally, current applications show that the main goal of CAs is to provide personal assistant functionality, while less attention goes to the actual interaction with the system, which should be improved by incorporating social behaviors (Elshan et al., 2022; Gnewuch et al., 2017; Krämer et al., 2011; Nißen et al., 2021; Rzepka & Berger, 2018). Most of these interactions are initiated by the user and not by the CA, which means that the CA acts reactively rather than proactively. Moreover, these interactions are isolated, transactional, and based on predefined paths, as if they were starting over every time (Seymour et al., 2018). Although, from a technological perspective, CAs can at present predominantly conduct restricted conversations related to a specific topic (Diederich et al., 2022), modern language prediction models such as the Generative Pre-trained Transformer 3 (GPT-3) are able to fundamentally expand the capabilities of CAs. They achieve this by enabling open-topic and richer conversations with a strong interpersonal character (Brown et al., 2020). GPT-3 and many other recent language models are built on the Transformer (Vaswani et al., 2017), a neural network architecture invented by Google Research in 2017. Google's recent language model LaMDA shows how human-like ways of interacting can be achieved by enabling open-topic conversations based on a modern autoregressive language model (a Transformer-based deep neural network). Google argues that although conversations "tend to revolve around a specific topic, their open-ended nature means they can start in one place and end up somewhere completely different." Especially when a person is talking to a friend instead of an assistant, the open-ended nature of conversat -Abstract Truncated-