RACER: An LLM-powered Methodology for Scalable Analysis of Semi-structured Mental Health Interviews

Satpreet Harcharan Singh, Kevin Jiang, Kanchan Bhasin, Ashutosh Sabharwal, Nidal Moukaddam, Ankit B Patel
2024-02-05
Abstract: Semi-structured interviews (SSIs) are a commonly employed data-collection method in healthcare research, offering in-depth qualitative insights into subject experiences. Despite their value, the manual analysis of SSIs is notoriously time-consuming and labor-intensive, in part due to the difficulty of extracting and categorizing emotional responses, and challenges in scaling human evaluation for large populations. In this study, we develop RACER, a Large Language Model (LLM) based expert-guided automated pipeline that efficiently converts raw interview transcripts into insightful domain-relevant themes and sub-themes. We used RACER to analyze SSIs conducted with 93 healthcare professionals and trainees to assess the broad personal and professional mental health impacts of the COVID-19 crisis. RACER achieves moderately high agreement with two human evaluators (72%), which approaches the human inter-rater agreement (77%). Interestingly, LLMs and humans struggle with similar content involving nuanced emotional, ambivalent/dialectical, and psychological statements. Our study highlights the opportunities and challenges in using LLMs to improve research efficiency and opens new avenues for scalable analysis of SSIs in healthcare research.
Computation and Language, Quantitative Methods
### What problem does this paper attempt to solve?

This paper addresses the cost of manually analyzing semi-structured interviews (SSIs) in healthcare research. SSIs provide in-depth qualitative insights, but analyzing them by hand is time-consuming and labor-intensive. Two factors make the process particularly difficult: extracting and categorizing emotional responses, and scaling human evaluation to large populations. To address this, the authors developed RACER, an expert-guided automated pipeline based on Large Language Models (LLMs) that converts raw interview transcripts into domain-relevant themes and sub-themes. Using RACER, the authors analyzed SSIs conducted with 93 healthcare professionals and trainees to assess the personal and professional mental health impacts of the COVID-19 crisis.

### Main Contributions

1. **Increased Efficiency**: RACER improves the efficiency of SSI analysis by automating the processing of large amounts of interview data.
2. **Agreement with Human Evaluators**: RACER reached 72% agreement with two human evaluators, close to the 77% agreement between the human evaluators themselves.
3. **Handling Complex Emotions**: LLMs and humans struggle with similar content, namely nuanced emotional, ambivalent/dialectical, and psychological statements. RACER shows both potential and limitations on such content.
4. **Extended Applications**: RACER opens new avenues for large-scale analysis of SSIs in healthcare research.

### Research Background

- **Semi-Structured Interviews (SSIs)**: Widely used in healthcare research and a source of in-depth qualitative insights, but manual analysis is time-consuming and resource-intensive.
- **Large Language Models (LLMs)**: Models like GPT-4 offer new methods for extracting and interpreting data from text corpora.
- **COVID-19 Crisis**: Brought significant personal and professional challenges to healthcare workers, including fear of infecting family members, grief over patient deaths, and moral dilemmas in resource allocation.

### Method

- **RACER Pipeline**:
    1. **Retrieve**: Use an LLM to extract relevant responses from interview transcripts.
    2. **Aggregate**: Collect the retrieved responses across all interviewees.
    3. **Cluster with Expert Guidance**: Cluster responses into themes and sub-themes with expert guidance.
    4. **Re-cluster**: Run the clustering process multiple times and determine the final cluster assignments by majority vote.

### Results

- **Emotional and Psychological Impact**: Most respondents reported negative emotions such as anxiety, stress, sadness, or anger, but some expressed positive emotions like gratitude.
- **Support and Coping Strategies**: Most respondents felt supported by colleagues and family, although family dynamics were also affected.
- **Work Impact**: Most healthcare workers experienced increased working hours and changes in patient management methods.
- **Future Outlook**: Some respondents were optimistic about the future, hoping the crisis would bring new opportunities and growth; others were concerned about long-term personal and professional impacts.

### Discussion

- **Advantages**: RACER improves the efficiency and scalability of SSI analysis.
- **Limitations**: RACER and human evaluators face similar challenges in handling complex emotions and psychological states, which highlights the indispensable role of human expertise in reviewing and interpreting LLM outputs.

In conclusion, this paper demonstrates the potential of RACER for analyzing SSIs in healthcare research, while also pointing out current limitations and future research directions.
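
The Re-cluster step in the Method section and the agreement figures can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the data layout (one dictionary of response ID to theme label per clustering run), the tie-breaking rule, and the use of simple percent agreement are all assumptions made for illustration, and the labels below are invented toy data rather than results from the study.

```python
from collections import Counter


def majority_vote(run_labels):
    """Return the most frequent theme label per response across clustering runs.

    run_labels: list of dicts, one per run, mapping response ID -> theme label.
    Assumption: ties go to the label that appeared first across the runs.
    """
    consensus = {}
    for response_id in run_labels[0]:
        votes = Counter(run[response_id] for run in run_labels)
        consensus[response_id] = votes.most_common(1)[0][0]
    return consensus


def percent_agreement(labels_a, labels_b):
    """Percentage of shared responses that received the same label from both raters."""
    shared = labels_a.keys() & labels_b.keys()
    if not shared:
        raise ValueError("no responses in common")
    matches = sum(labels_a[r] == labels_b[r] for r in shared)
    return 100.0 * matches / len(shared)


# Hypothetical toy data: three clustering runs over three responses.
runs = [
    {"r1": "anxiety", "r2": "gratitude", "r3": "workload"},
    {"r1": "anxiety", "r2": "stress", "r3": "workload"},
    {"r1": "stress", "r2": "gratitude", "r3": "workload"},
]
consensus = majority_vote(runs)

# Hypothetical labels from one human evaluator.
human = {"r1": "anxiety", "r2": "gratitude", "r3": "family"}

print(consensus)
print(round(percent_agreement(consensus, human), 1))
```

Repeating the clustering and voting reduces the effect of run-to-run variation in LLM output on the final assignment. The same `percent_agreement` helper could compare two human evaluators with each other, which is the kind of comparison behind the 72% and 77% figures, though the source does not state the exact agreement metric used.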