Abstract: In the current era of artificial intelligence, large language models such as ChatGPT and BARD are increasingly used for applications such as language translation, text generation, and human-like conversation. Because these models are trained on large amounts of data that include many different opinions and perspectives, they could enable a new qualitative research approach: due to the probabilistic character of their answers, “interviewing” these large language models could give insights into public opinions in a way that otherwise only interviews with large groups of subjects could deliver. However, it is not yet clear whether qualitative content analysis research methods can be applied to interviews with these models. Evaluating the applicability of qualitative research methods to interviews with large language models could foster our understanding of their abilities and limitations. In this paper, we examine the applicability of qualitative content analysis research methods to interviews with ChatGPT in English, ChatGPT in German, and BARD in English on the relevance of computer science in K-12 education, which served as an exemplary topic. We found that the answers produced by these models depended strongly on the provided context, and that the same model could produce widely differing results for the same questions. From these results and the insights gained throughout the process, we formulated guidelines for conducting and analyzing interviews with large language models. Our findings suggest that qualitative content analysis research methods can indeed be applied to interviews with large language models, but only with careful consideration of contextual factors that may affect the responses produced by these models. The guidelines we provide can aid researchers and practitioners in conducting more nuanced and insightful interviews with large language models.
Taking our results as a whole, we generally do not recommend using interviews with large language models for research purposes, because their results are highly unpredictable. However, we suggest using these models as exploration tools for gaining different perspectives on research topics and for testing interview guidelines before conducting real-world interviews.

From Text Attribution to Data Extraction: Applications of Big Language Models in Historical Science

If the Sources Could Talk: Evaluating Large Language Models for Research Assistance in History

Exploring Large Language Models for Classical Philology

Exploring the Application Potential of the Large Language Model in Sociological Research: A Case Study of ChatGPT

From ChatGPT, DALL-E 3 to Sora: How has Generative AI Changed Digital Humanities Research and Services?

Large Language Models and Generative AI, Oh My!

How Can Generative Artificial Intelligence Techniques Facilitate Intelligent Research into Ancient Books?

Exploring the potential of large language models and generative artificial intelligence (GPT): Applications in Library and Information Science

Lost in Translation: Large Language Models in Non-English Content Analysis

Qualitative Research Methods for Large Language Models: Conducting Semi-Structured Interviews with ChatGPT and BARD on Computer Science Education

Large Language Models for Cultural Heritage

Algorithmic Ghost in the Research Shell: Large Language Models and Academic Knowledge Creation in Management Research

Perils and opportunities in using large language models in psychological research

Enhancing qualitative research in psychology with large language models: a methodological exploration and examples of simulations

An Interdisciplinary Outlook on Large Language Models for Scientific Research

Scientific Computing with Large Language Models

Using large language models in psychology

AI for Biomedicine in the Era of Large Language Models

Large Language Models on Graphs: A Comprehensive Survey

Large Language Models as Data Preprocessors

On the application of Large Language Models for language teaching and assessment technology