Demographical Gender Prediction of Twitter Users Using Big Data Analytics: an Application of Decision Marketing

Sudipta Roy, Bhavya Patel, Debnath Bhattacharyya, Kushal Dhayal, Tai-Hoon Kim, Mamta Mittal
DOI: https://doi.org/10.1504/ijris.2021.114629
2021-01-01
International Journal of Reasoning-based Intelligent Systems
Abstract: Twitter text is difficult to analyse because the data are non-standard and unstructured. Unlike other popular social media platforms, Twitter does not collect user gender information. Predicting demographic features and additional informative content is important for advertising, custom-made marketing and authorised investigation on social media. The proposed statistical method, which performs real-time analysis using big data technologies, is able to predict the gender of Twitter users. Gender prediction is performed using the naive Bayes classifier to address systemic issues, and Apache Hive is used to handle data cleaning, storage and processing. The proposed method is a fast, easy-to-implement document text categorisation method that, with pre-processing, performs close to the state of the art using big data technologies.
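As a rough illustration of the naive Bayes text classification the abstract mentions, the following is a minimal sketch and not the authors' implementation. It assumes a multinomial model with add-one (Laplace) smoothing and a crude regex tokenizer; the class name `NaiveBayesText`, the toy training texts, and the placeholder labels `class_a` and `class_b` are all hypothetical and stand in for cleaned tweets with gender labels. The Apache Hive cleaning and storage stage is not modelled here.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (crude cleaning)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesText:
    """Multinomial naive Bayes with add-one (Laplace) smoothing (illustrative)."""

    def fit(self, texts, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, text):
        # log P(class) + sum of log P(word | class) over known words.
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy data; real input would be pre-processed tweets with labels.
train_texts = [
    "football match tonight",
    "great match great goal",
    "new recipe for dinner",
    "dinner recipe was great",
]
train_labels = ["class_a", "class_a", "class_b", "class_b"]

model = NaiveBayesText().fit(train_texts, train_labels)
print(model.predict("match goal"))
print(model.predict("dinner recipe"))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the add-one smoothing keeps a word that appears in only one class from driving the other class's probability to zero.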