Abstract: In recent years, there has been growing interest in using social media to understand social phenomena. Researchers have demonstrated many important applications of online social media for understanding real-world events, such as presidential election prediction, early earthquake detection, and disaster management. A social media site hosts a mix of users who differ in gender, location, ideology, and other attributes. Different types of users may have different motivations, different opinions on certain topics, different resources at their disposal, and different behaviors during events. If researchers want to understand what is happening on a social media site, it is important to know where a post comes from, who wrote it, and which party the author belongs to. However, this information is not explicitly provided by users. In this thesis, the goal is to predict users’ latent attributes, such as their locations, social identities, and political orientations. Thanks to the massive amount of text data on social media, we can learn rich knowledge from text to predict users’ attributes. Meanwhile, text data from social media often comes with a significant amount of metadata. Furthermore, data from social networks also contains rich connection information, e.g., mentioning and following. Combining text, metadata, and the user network for user attribute prediction remains a challenging task.

In this thesis, I approach user attribute prediction at three levels: single post, user timeline, and graph-level classification. I start with a global location prediction system that uses a single tweet as input to infer a user’s location. It utilizes location-related features in a tweet, such as text and user profile metadata. I extend the tweet-level prediction system to the user level by combining multiple posts in a user’s timeline. I demonstrate the effectiveness of this model on the task of user social identity classification.
An improved user-level hierarchical location prediction model is also presented. In the models described above, I mainly focus on learning user attributes from the users themselves. In the next step, I consider the social graph as additional information to improve performance. Users connected in a social network often show similarities in certain aspects, a well-known phenomenon called social homophily. Experiments demonstrate that incorporating the social graph dramatically improves the performance of our prediction system compared to the previous user-level method. As a case study of the attribute prediction system, I apply the method to a real-world emergency event: the novel coronavirus outbreak that began in 2019. I demonstrate that we gain a better understanding of the public conversation during this global emergency event.

Augmenting Input Method Language Model with user Location Type Information

Location Inference for Non-geotagged Tweets in User Timelines

Location Inference for Non-Geotagged Tweets in User Timelines [Extended Abstract]

Understanding User Behavior of Asking Location-Based Questions on Microblogs

A Location Inferring Model Based on Tweets and Bilateral Follow Friends.

Estimation of User Location and Local Topics Based on Geo-tagged Text Data on Social Media

Fine-grained Geolocation Prediction of Tweets with Human Machine Collaboration

HisRect: Features from Historical Visits and Recent Tweet for Co-Location Judgement

Mining geographic knowledge using location aware topic model

Localize Online Social Network User Via Social Sensing

Effective location identification from microblogs

A BiLSTM-CNN Model for Predicting Users’ Next Locations Based on Geotagged Social Media

Geosocial Location Classification: Associating Type to Places Based on Geotagged Social-Media Posts

Location Prediction in Social Media Based on Contents and Graphs

Learning User Latent Attributes on Social Media

Geolocation differences of language use in urban areas

Predicting Mobile Users' Next Location Using the Semantically Enriched Geo-Embedding Model and the Multilayer Attention Mechanism

Multiview Deep Learning for Predicting Twitter Users' Location

Discovery of Relevance Between Location and Topics of Micro-blogs

Location and Trajectory Identification from Microblogs.

Location Prediction with Communities in User Ego-Net in Social Media