Blending citizen science with natural language processing and machine learning: Understanding the experience of living with multiple sclerosis

Christina Haag, Nina Steinemann, Deborah Chiavi, Christian P. Kamm, Chloé Sieber, Zina-Mary Manjaly, Gábor Horváth, Vladeta Ajdacic-Gross, Milo Alan Puhan, Viktor von Wyl
DOI: https://doi.org/10.1371/journal.pdig.0000305
2023-08-03
PLOS Digital Health
Abstract: The emergence of new digital technologies has enabled a new way of doing research, including active collaboration with the public ('citizen science'). Innovation in machine learning (ML) and natural language processing (NLP) has made automatic analysis of large-scale text data accessible, allowing individual perspectives to be studied conveniently and efficiently. Here we blend citizen science with innovation in NLP and ML to examine (1) which categories of life events persons with multiple sclerosis (MS) perceived as central for their MS; and (2) the associated emotions. We subsequently relate our results to standardized individual-level measures. Participants (n = 1039) took part in the 'My Life with MS' study of the Swiss MS Registry, which involved telling their story through self-selected life events using text descriptions and a semi-structured questionnaire. We performed topic modeling ('latent Dirichlet allocation') to identify high-level topics underlying the text descriptions. Using a pre-trained language model, we performed a fine-grained emotion analysis of the text descriptions. A topic modeling analysis of a total of 4293 descriptions revealed eight underlying topics. Five topics are common in clinical research: 'diagnosis', 'medication/treatment', 'relapse/child', 'rehabilitation/wheelchair', and 'injection/symptoms'. However, three topics, 'work', 'birth/health', and 'partnership/MS', represent domains that are of great relevance for participants but are generally understudied in MS research. While emotions were predominantly negative (sadness, anxiety), emotions linked to the topics 'birth/health' and 'partnership/MS' were also positive (joy). Designed in close collaboration with persons with MS, the 'My Life with MS' project explores the experience of living with the chronic disease of MS using NLP and ML.
Our study thus contributes to the body of research demonstrating the potential of integrating citizen science with ML-driven NLP methods to explore the experience of living with a chronic condition.

Text data is a powerful way to allow individuals to freely express their experiences in their own words. This is especially valuable when conducting large-scale studies that involve and collaborate with the public ('citizen science'). Historically, text analysis has been time-consuming, but innovations in natural language processing and machine learning, both sub-disciplines of artificial intelligence, have made analysis convenient and efficient. We show how this can be implemented using the example of the Swiss Multiple Sclerosis (MS) Registry, a longitudinal patient-centered project. Persons with MS were asked to tell the story of their life with MS in their own words, guided by open-ended questions. We analyzed which high-level topics of life events participants perceived as central and which emotions were reflected in their text entries. Our findings revealed eight distinct topics: 'diagnosis', 'medication/treatment', 'rehabilitation/wheelchair', 'injection/symptoms', 'relapses/children', 'work', 'birth/health', and 'partnership/MS'. Emotions linked to most text entries were predominantly negative (feelings of sadness, anxiety), with the exception of the topics 'birth/health' and 'partnership/MS', where the associated emotions were also positive (feelings of joy). Using the Swiss MS Registry as a flagship project, our research adds to the work investigating how innovations in NLP and ML can facilitate on-par collaboration with citizens.
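To make the topic modeling step more concrete, the following is a minimal sketch of latent Dirichlet allocation fitted by collapsed Gibbs sampling, written in plain Python. It is not the authors' pipeline: the tokenized toy documents are invented for illustration, the function name `fit_lda` and the hyperparameters `alpha`, `beta`, `n_iter`, and `seed` are assumptions, and two topics are used here rather than the eight reported in the study.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on already-tokenized documents.

    Returns per-document topic proportions and the top words of each topic.
    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables used by the sampler.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            z = rng.randrange(n_topics)
            doc_assignments.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[word]] += 1
            topic_total[z] += 1
        assignments.append(doc_assignments)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                w = word_id[word]
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample the topic in proportion to
                # (doc-topic count + alpha) * (topic-word count + beta) / topic size.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed topic proportions per document (each row sums to 1).
    doc_mix = [
        [(count + alpha) / (len(doc) + n_topics * alpha) for count in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # Three most frequent words per topic.
    top_words = [
        [vocab[i] for i in sorted(range(n_words), key=lambda i: -topic_word[k][i])[:3]]
        for k in range(n_topics)
    ]
    return doc_mix, top_words


# Invented toy data standing in for tokenized life-event descriptions.
docs = [
    ["diagnosis", "neurologist", "mri", "diagnosis"],
    ["mri", "diagnosis", "hospital"],
    ["work", "employer", "job", "work"],
    ["job", "work", "employer", "reduced"],
]
doc_mix, top_words = fit_lda(docs, n_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

In practice, a library implementation with proper preprocessing (stop-word removal, lemmatization, and a principled choice of the number of topics) would be preferable; the sketch only shows how word co-occurrence within descriptions drives the topic assignments.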