Are Language Features Associated with Psychosis Risk Universal? A Study in Mandarin-speaking Youths at Clinical High Risk for Psychosis
Carla Agurto,Raquel Norel,Bo Wen,Yanyan Wei,Dan Zhang,Zarina Bilgrami,Xiaolu Hsi,Tianhong Zhang,Ofer Pasternak,Huijun Li,Matcheri Keshavan,Larry J. Seidman,Susan Whitfield-Gabrieli,Martha E. Shenton,Margaret A. Niznikiewicz,Jijun Wang,Guillermo Cecchi,Cheryl Corcoran,William S. Stone
DOI: https://doi.org/10.1002/wps.21045
2023-01-01
World Psychiatry
World Psychiatry, Volume 22, Issue 1, p. 157-158. Letter to the Editor.

Author affiliations:
Carla Agurto: IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Raquel Norel: IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Bo Wen: IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Yanyan Wei: Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Dan Zhang: Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Zarina Bilgrami: Department of Psychology, Emory University, Atlanta, GA, USA
Xiaolu Hsi: Department of Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
Tianhong Zhang: Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Ofer Pasternak: Department of Psychiatry, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Huijun Li: Department of Psychology, Florida A&M University, Tallahassee, FL, USA
Matcheri Keshavan: Department of Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
Larry J. Seidman: Department of Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
Susan Whitfield-Gabrieli: Department of Psychology, Northeastern University, Boston, MA, USA
Martha E. Shenton: Department of Psychiatry and Department of Radiology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Margaret A. Niznikiewicz: Department of Psychiatry, VA Boston Healthcare System, Brockton, MA, USA
Jijun Wang: Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Guillermo Cecchi: IBM T.J. Watson Research Center, Yorktown Heights, NY, USA
Cheryl Corcoran: Icahn School of Medicine at Mount Sinai, New York, NY, USA
William S. Stone: Department of Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA

First published: 14 January 2023

The authors would like to thank the study participants at the Shanghai Mental Health Center. C. Agurto and R. Norel contributed equally to the work. Supplementary information on this study can be found at https://wenboown.github.io/SHARP-NLP-Letter-to-WP/.

Natural language processing (NLP) analyses have shown decreased coherence (tangentiality, derailment) and complexity (poverty of content) in schizophrenia and in clinical high risk (CHR) states for psychosis. We reported previously in this journal [1] that an NLP machine learning classifier, which included measures of coherence and complexity, predicted psychosis onset in two independent English-speaking CHR samples.
Moreover, reduced complexity has been associated with increased pauses and negative symptoms in at-risk youths [2]. Multiple recent NLP studies in schizophrenia and CHR cohorts, using different methods, have largely found this same pattern of disturbance in the structure of language and speech [3]. Most of these studies have been conducted in English, with notable exceptions including Dutch, Portuguese and Spanish [4]. It remains unknown, however, whether NLP findings obtained in English or other Indo-European languages would generalize to less similar languages, such as Mandarin, which has very different grammatical and prosodic conventions.

This study included 20 help-seeking CHR youth and 25 healthy controls who were recruited as part of the US National Institute of Mental Health (NIMH)-funded Shanghai-At-Risk for Psychosis (SHARP) study at the Shanghai Mental Health Center, where institutional review board approval was obtained. Caseness and symptoms were determined using the Structured Interview for Psychosis-Risk Syndromes (SIPS) [5]. Subjects were Han Chinese and spoke Mandarin fluently, and they provided informed consent. Sex distribution was similar between CHR subjects and controls (55% vs. 48% female), but CHR subjects were younger (19.6±6.4 vs. 24.9±1.9 years) and had less education (11.4±4.0 vs. 16.7±1.4 years).

Interviews were approximately 30 min in length, and were based on qualitative methods previously described [6]. They were transcribed verbatim in Mandarin and translated into English using Google Translate, with verification by bilingual researchers. Audio recordings were diarized (segmented by speaker using time stamps from transcription) so that acoustic analyses could be performed on subjects' speech. NLP features analyzed for both English and Mandarin included coherence, complexity, and sentiment (i.e., emotional valence: positive, negative, neutral), as reported previously [1, 7].
For English NLP only, sentiment also included anger, fear, sadness, joy and disgust; frequency of wh-words (e.g., "which") was also assessed. For Mandarin NLP only, frequency of measure words, possessives, and localizers (e.g., gongzuo-shang, "during work"; or liangge-ren-zhijian, "between two people") was also calculated [8]. Acoustic features analyzed in Mandarin included those characteristic of schizophrenia or CHR states among English-speaking subjects, including abnormal pauses, flat intonation, voice breaks, and pitch variation [7]. All features were corrected for age and education by applying regression coefficients from healthy controls, and highly correlated features were removed from analysis. Machine learning classification was done using random forest and support vector machines (SVM) for Mandarin NLP, English NLP, and acoustics, with each experiment repeated 20 times, identifying the top five features of each model. Associations among linguistic features across the two languages (cross-language analysis) and between features and symptoms (symptom inference) were also tested (see also supplementary information).

Each of the three SVM machine learning classifiers showed high accuracy in discriminating spoken language in CHR subjects from that of healthy controls: English-specific NLP (95%), Mandarin-specific NLP (94%), and acoustic analysis (88%), with similar results for random forest. Top features for the English-specific NLP machine learning were wh-word and noun use (greater in CHR), and coherence, adjective use and adverb use (all less in CHR). Top features for the Mandarin-specific NLP machine learning were localizer use (greater in CHR), and positive sentiment, two metrics of coherence, and adjective use (all decreased in CHR). Of note, features common to the NLP machine learning for both languages were highly correlated, specifically coherence (r=0.70) and adjective use (r=0.60).
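The covariate correction and feature pruning described in the methods can be illustrated with a short sketch. This is not the authors' code: the function names, the ordinary least-squares fit, the 0.9 correlation cutoff, and the synthetic data are all assumptions made for illustration. The letter states only that regression coefficients from healthy controls were applied and that highly correlated features were removed.

```python
import numpy as np


def fit_control_coefficients(features_hc, covariates_hc):
    """Fit feature ~ intercept + covariates by least squares, in healthy controls only."""
    design = np.column_stack([np.ones(len(covariates_hc)), covariates_hc])
    coef, *_ = np.linalg.lstsq(design, features_hc, rcond=None)
    return coef  # shape: (1 + n_covariates, n_features)


def apply_correction(features, covariates, coef):
    """Subtract the control-derived covariate prediction from every subject."""
    design = np.column_stack([np.ones(len(covariates)), covariates])
    return features - design @ coef


def drop_correlated(features, threshold=0.9):
    """Greedily keep a feature only if |r| with every kept feature is below threshold."""
    corr = np.abs(np.corrcoef(features, rowvar=False))
    keep = []
    for j in range(corr.shape[0]):
        if all(corr[j, k] < threshold for k in keep):
            keep.append(j)
    return keep


# Synthetic data only; group sizes and covariate spreads loosely mirror the letter.
rng = np.random.default_rng(0)
cov_hc = np.column_stack([rng.normal(25, 2, 25), rng.normal(17, 1.5, 25)])
cov_chr = np.column_stack([rng.normal(20, 6, 20), rng.normal(11, 4, 20)])


def simulate(cov):
    # Two features that depend on age and education, plus a near-copy of the first.
    base = 0.1 * cov[:, [0]] + 0.2 * cov[:, [1]] + rng.normal(size=(len(cov), 2))
    near_copy = base[:, [0]] + rng.normal(scale=0.01, size=(len(cov), 1))
    return np.hstack([base, near_copy])


x_hc, x_chr = simulate(cov_hc), simulate(cov_chr)

coef = fit_control_coefficients(x_hc, cov_hc)            # learned from controls only
resid_hc = apply_correction(x_hc, cov_hc, coef)
resid_chr = apply_correction(x_chr, cov_chr, coef)       # same coefficients for CHR

keep = drop_correlated(np.vstack([resid_hc, resid_chr]))
print("retained feature indices:", keep)
```

Fitting on controls alone means that any group difference linked to psychosis risk is not absorbed into the age and education terms, which matters here because the CHR group was both younger and less educated. The greedy pruning order is arbitrary in this sketch; a different ordering would retain a different member of each correlated pair.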
For acoustics, the top features in the machine learning classifier were two pause metrics, and three indices of acoustic quality: chroma #11 (timbre/quality), bandwidth formant #1 (dysphonia/hoarseness), and spectral spread (energy – decibels/frequency). Of note, only acoustic features were significantly associated with symptoms (negative: r=0.69, p=0.0008; positive: r=0.49, p=0.03) (see also supplementary information).

Several important findings emerge from this proof-of-principle study. First, in Mandarin, spoken language can differentiate CHR subjects from healthy controls with high accuracy, using either linguistic or acoustic features. Second, the application of English-specific NLP to transcripts translated from Mandarin has utility, as there was comparable accuracy for both the English-specific and Mandarin-specific NLP. Further, there was overlap in top features in the two NLP classifiers, specifically decreases in adjective use and coherence, with both of these features highly correlated across the two languages, suggesting that these key metrics survive translation. Nonetheless, the application of Mandarin-specific NLP allowed the identification of a key linguistic feature that would not be captured otherwise – the increased use of localizers among CHR subjects – which may reflect concreteness or increased use of idioms; this is a new finding that merits replication and further investigation. Finally, the acoustic classifier, in addition to having high accuracy, identified features similar to those found in English-speaking CHR and schizophrenia cohorts, including abnormal pause behavior, and indices of voice quality and energy. As in prior studies, acoustic features were associated with symptoms, in particular negative symptoms.

This study is the first to use natural language processing and acoustic analyses to characterize spoken language among native Mandarin speakers in China identified as at clinical high risk for psychosis.
Our findings support the idea that there may be universal features of spoken language disturbance across psychosis and its risk states, particularly with respect to reduced coherence, but also word usage and pause behavior that may index reduced complexity. Yet, our study also shows that there are language-specific features characteristic of psychosis risk, suggesting that it is essential to also analyze spoken language using language-specific NLP methods.

This is a small proof-of-principle study with the potential confounds of age and education, and none of the classifiers generated were cross-validated in a second cohort. Therefore, these findings should be investigated and replicated in a larger cohort of Mandarin-speaking CHR subjects and healthy controls who are more similar in demographics. More broadly, future studies should include a similar heuristic of using both English-based and language-specific NLP approaches, as well as acoustic analyses, to assess spoken language in CHR cohorts (e.g., English, Mandarin, Cantonese, Korean, Spanish, German, Portuguese, Danish, French, Italian) from around the world, as is planned for the Accelerating Medicines Partnership® Program – Schizophrenia, to determine both universal and language-specific features of language disturbance characteristic of clinical risk for psychosis.

References
1. Corcoran CM, Carrillo F, Fernandez-Slezak D et al. World Psychiatry 2018; 17: 67-75.
2. Stanislawski ER, Bilgrami ZR, Sarac C et al. NPJ Schizophr 2021; 7: 3.
3. Corcoran CM, Cecchi G. Biol Psychiatry Cogn Neurosci Neuroimaging 2020; 5: 770-9.
4. Corcoran CM, Mittal VA, Bearden CE et al. Schizophr Res 2020; 226: 158-66.
5. McGlashan TH, Walsh BC, Woods SW. Structured Interview for Psychosis-Risk Syndromes. New Haven: PRIME Research Clinic, Yale School of Medicine, 2017.
6. Ben-David S, Birnbaum ML, Eilenberg ME et al. Psychiatr Serv 2014; 65: 1499-501.
7. Agurto C, Cecchi GA, Norel R et al. Neuropsychopharmacology 2020; 45: 823-32.
8. Chappell H, Peyraube A. In: D Xu (ed). Space in languages of China: cross-linguistic, synchronic and diachronic perspectives. Dordrecht: Springer, 2008: 15-37.

Volume 22, Issue 1, February 2023, Pages 157-158