Identifying e-cigarette content on TikTok: Using a BERTopic Modeling approach

Juhan Lee, Rachel R Ouellette, Dhiraj Murthy, Ben Pretzer, Tanvi Anand, Grace Kong
DOI: https://doi.org/10.1093/ntr/ntae171
2024-07-13
Abstract:

Introduction: The use of hashtags is a common way to promote e-cigarette content on social media. Analysis of hashtags may provide insight into e-cigarette promotion on social media. However, examining text data is complicated by the sheer volume of social media data. This study used a machine learning approach (i.e., Bidirectional Encoder Representations from Transformers [BERT] topic modeling) to identify e-cigarette content on TikTok.

Methods: We used 13 unique hashtags related to e-cigarettes (e.g., #vape) for data collection. The final analytic sample included 12,573 TikTok posts. To identify the best-fitting number of topic clusters, we used both quantitative (i.e., coherence test) and qualitative approaches (i.e., researchers checked the relevance of text from each topic). We then grouped the clustered text into themes and characterized each theme.

Results: We determined that N=18 was the ideal number of topic clusters. Nine overarching themes were identified: Social media and TikTok-related features (N=4; "duet", "viral"), Vape shops and brands (N=3; "store"), Vape tricks (N=3; "ripsaw"), Modified use of e-cigarettes (N=1; "coil", "wire"), Vaping and girls (N=1; "girl"), Vape flavors (N=1; "flavors"), Vape and cigarettes (N=1; "smoke"), Vape identities and communities (N=1; "community"), and Non-English language (N=3; Romanian and Spanish).

Conclusions: This study used a machine learning method, BERTopic modeling, to successfully identify relevant themes on TikTok. This method can inform future social media research examining other tobacco products, and tobacco regulatory policies such as monitoring of e-cigarette marketing on social media.

Implications: This study can inform future social media research examining other tobacco products, and tobacco regulatory policies such as monitoring of e-cigarette marketing on social media.
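The quantitative step in the Methods (a coherence test used to compare candidate numbers of topic clusters) can be illustrated with a minimal sketch. The abstract does not say which coherence measure was used, so the UMass measure below is an assumption, and the toy posts, candidate topic word lists, and function names are hypothetical. In practice the candidate word lists would come from topic models fitted with different numbers of topics.

```python
import math
from itertools import combinations


def umass_coherence(top_words, docs):
    """UMass coherence of one topic (assumed measure, not confirmed by the abstract).

    Sums log((co-document count + 1) / document count of the earlier word)
    over all ordered pairs of the topic's top words.
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        # Number of documents containing every word in `words`.
        return sum(all(w in s for w in words) for s in doc_sets)

    score = 0.0
    for earlier, later in combinations(top_words, 2):
        d_earlier = doc_freq(earlier)
        if d_earlier == 0:
            continue  # word never appears in the corpus; skip the pair
        score += math.log((doc_freq(earlier, later) + 1) / d_earlier)
    return score


def mean_coherence(topics, docs):
    """Average UMass coherence across all topics of one candidate model."""
    return sum(umass_coherence(t, docs) for t in topics) / len(topics)


# Hypothetical tokenized posts (hashtags with the '#' removed).
docs = [
    ["vape", "vapetricks", "clouds"],
    ["vape", "vapetricks", "clouds"],
    ["vapeshop", "store", "vape"],
    ["vapeshop", "store"],
]

# Hypothetical top words per topic for two candidate numbers of topics.
candidates = {
    2: [["vape", "vapetricks", "clouds"], ["vapeshop", "store"]],
    3: [["vapetricks", "clouds"], ["vapeshop", "store"], ["clouds", "store"]],
}

# Pick the candidate whose topics are most coherent on average.
best_k = max(candidates, key=lambda k: mean_coherence(candidates[k], docs))
```

A higher score means a topic's top words tend to appear in the same posts. As the Methods note, a quantitative score like this was paired with a qualitative check of each topic's text, so it would serve as a starting point for choosing the number of topics rather than the sole criterion.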