AcademiaOS: Automating Grounded Theory Development in Qualitative Research with Large Language Models

Thomas Übellacker
2024-03-13
Abstract: AcademiaOS is a first attempt to automate grounded theory development in qualitative research with large language models. Using recent large language models' language understanding, generation, and reasoning capabilities, AcademiaOS codes curated qualitative raw data such as interview transcripts and develops themes and dimensions to further develop a grounded theoretical model, affording novel insights. A user study (n=19) suggests that the system finds acceptance in the academic community and exhibits the potential to augment humans in qualitative research. AcademiaOS has been made open-source for others to build upon and adapt to their use cases.
Human-Computer Interaction, Artificial Intelligence, Information Retrieval
What problem does this paper attempt to address?
This paper asks how to design and implement a basic open-source platform that uses large language models (LLMs) to automate grounded theory development. Specifically, it explores how LLMs can automate or assist three tasks in qualitative research: coding the data, aggregating codes into dimensions, and constructing theory, with the aim of improving research efficiency and quality. The paper proposes and implements AcademiaOS, an open-source platform for these tasks. According to the paper, the resulting method is transparent, easy to access, and scalable because it is open source, and it can broaden the evidence base because analyzing multiple qualitative data sources in parallel is cost-effective. The paper sees the greatest impact in the social sciences, especially organizational theory, with applicability to other related disciplines. The stated research question is: "How to effectively design and implement a basic open-source platform to automate the development of grounded theory using large language models?"
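The coding and aggregation steps described above can be illustrated with a minimal Python sketch. It is not the AcademiaOS implementation: the names `GroundedAnalysis`, `analyze`, and `stub_llm`, the prompt prefixes, and the "label: member; member" reply format are all assumptions made for illustration, and the model call is replaced by canned replies so that the example runs without any external service. The final theory-construction step is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: an "LLM" is any function that takes a prompt and returns text.
LLM = Callable[[str], str]


@dataclass
class GroundedAnalysis:
    codes: List[str]                  # first-order codes found in the raw data
    themes: Dict[str, List[str]]      # theme -> codes grouped under it
    dimensions: Dict[str, List[str]]  # dimension -> themes grouped under it


def parse_list(reply: str) -> List[str]:
    """Read one item per line, ignoring bullet markers and blank lines."""
    items = (line.strip().lstrip("-*").strip() for line in reply.splitlines())
    return [item for item in items if item]


def parse_groups(reply: str) -> Dict[str, List[str]]:
    """Read lines shaped like 'label: member; member' (hypothetical format)."""
    groups: Dict[str, List[str]] = {}
    for line in parse_list(reply):
        label, sep, members = line.partition(":")
        if sep:
            groups[label.strip()] = [m.strip() for m in members.split(";") if m.strip()]
    return groups


def analyze(documents: List[str], llm: LLM) -> GroundedAnalysis:
    # Step 1: code each document, keeping every distinct code once.
    codes: List[str] = []
    for doc in documents:
        for code in parse_list(llm(f"CODE\n{doc}")):
            if code not in codes:
                codes.append(code)
    # Step 2: group codes into themes, then themes into dimensions.
    themes = parse_groups(llm("THEMES\n" + "\n".join(codes)))
    dimensions = parse_groups(llm("DIMENSIONS\n" + "\n".join(themes)))
    return GroundedAnalysis(codes, themes, dimensions)


def stub_llm(prompt: str) -> str:
    """Canned replies standing in for a real model, for demonstration only."""
    task = prompt.split("\n", 1)[0]
    if task == "CODE":
        return "- unclear priorities\n- meeting overload"
    if task == "THEMES":
        return "coordination cost: unclear priorities; meeting overload"
    return "organizational friction: coordination cost"


result = analyze(["transcript A", "transcript B"], stub_llm)
print(result.dimensions)
```

Passing the model as a plain function keeps the sketch independent of any particular provider; a real model client could be substituted for `stub_llm` as long as its replies follow the assumed line format.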