Exploring a Language-Based Interest Assessment: Predicting Vocational Interests on Social Media Using Natural Language Processing
Yan Yi Lance Du, Devansh Jain, Young-Min Cho, Daphne Xin Hou, Sharath Chandra Guntuku, Lyle Ungar, Louis Tay
DOI: https://doi.org/10.1177/10690727241289125
2024-10-16
Journal of Career Assessment (Ahead of Print)
Abstract: For more than a century, self-report inventories have been the traditional method for assessing vocational interests. Little research has examined the use of machine learning techniques, such as natural language processing (NLP), in interest assessment. This paper explores the extent to which natural language on social media can be used to predict individuals' self-ratings on eight basic interests representing the SETPOINT model: Agriculture, Engineering, Human Resources, Life Science, Management/Administration, Mechanics/Electronics, Media, and Social Science. Leveraging closed-vocabulary (Linguistic Inquiry and Word Count; LIWC) and open-vocabulary (latent Dirichlet allocation [LDA] topic modeling) NLP approaches, we analyzed 3.2 million Facebook posts from 2,834 participants who completed a 32-item basic interest measure. We found that the convergent validities of these NLP approaches for predicting vocational interest scores (LIWC: r = 0.19; LDA topic modeling: r = 0.24) are comparable to those reported in prior research on language-based personality assessments. Our study also revealed largely face-valid language markers that characterize different basic interests. Implications for developing language-based interest assessments for applied settings (e.g., career guidance and employee selection) and future research directions are discussed.
psychology, applied
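
Illustrative sketch (not the authors' code): the abstract reports convergent validity as a correlation between language-derived scores and self-rated interests. The Python below shows, under strong simplifying assumptions, what a closed-vocabulary feature and that correlation might look like. The dictionary TOY_CATEGORIES and the function names category_rates and pearson_r are hypothetical; the real LIWC lexicon is proprietary and much larger, and the study presumably feeds such features into a predictive model rather than correlating a single feature directly.

```python
import math
import re
from collections import Counter

# Hypothetical mini-dictionary standing in for a closed vocabulary.
# These categories and words are invented for illustration only.
TOY_CATEGORIES = {
    "work": {"job", "office", "meeting", "boss"},
    "nature": {"farm", "garden", "soil", "harvest"},
}


def category_rates(text, categories=TOY_CATEGORIES):
    """Return the share of tokens in `text` that belong to each category."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens) or 1  # avoid division by zero on empty text
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    return {name: counts[name] / total for name in categories}


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


if __name__ == "__main__":
    # Made-up posts and made-up self-ratings, purely to exercise the code.
    posts = [
        "Spent all day on the farm checking the soil",
        "Another meeting at the office with my boss",
        "Planted the garden before the harvest",
    ]
    self_rated_agriculture = [4.5, 1.5, 5.0]  # fabricated toy values
    nature_rates = [category_rates(p)["nature"] for p in posts]
    print(pearson_r(nature_rates, self_rated_agriculture))
```

In practice one would aggregate language per participant, fit a model on held-out data, and correlate out-of-sample predictions with the self-ratings; the sketch skips those steps to keep the idea of a language-to-self-report correlation visible.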