Analyzing the Extent to which Gender Bias Exists in News Articles Using Natural Language Processing Techniques
Nihita Guda
DOI: https://doi.org/10.47611/jsrhs.v12i1.3865
2023-02-28
Journal of Student Research
Abstract: Prior studies have shown the existence of gender bias in job postings, performance reviews, and letters of recommendation. However, very little research has been done on the presence of gender bias in mainstream news sources and how it varies across publications. Given the rapid pace of news dissemination, human editing is not effective enough to address these biases. Even computer programs that parse news articles for specific words and references fall short of detecting undertones and implicit references, which is why more sophisticated techniques based on Artificial Intelligence (AI) are necessary. In this study, I used Natural Language Processing (NLP) methods, implemented as a series of Python programs, to analyze how biases in news content vary in type, variety, and intensity. I used over 500,000 news articles from 15 publications, spanning over 4 years, to build and train my algorithm. Using Word2Vec, a popular NLP method, I was able to conclude that more right-leaning publications are more likely to exhibit misogynistic content. However, the method fell short of identifying many forms of objectification, such as benevolent sexism. Similarly, using VADER, a Python sentiment analysis tool, I determined that metrics of positive, negative, and neutral sentiment alone are not sufficient to detect occurrences of gender bias. To gauge the breadth of sexist language effectively, I used the LIWC text analysis program, which calculates the percentage of words in a given text that fall into one or more of over 80 linguistic, psychological, and topical categories indicating various social, cognitive, and affective processes. As a result, my study was able to conclude, with statistical evidence, that implicit gender bias occurs across all publications but is more prevalent in right-leaning publications.
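To make the LIWC-style measurement described above concrete, the following is a minimal sketch of how the percentage of words falling into each category could be computed. It is not the LIWC program itself and not the code used in this study: the lexicon `TOY_LEXICON`, its three categories, and the function name `category_percentages` are hypothetical stand-ins, since the real LIWC dictionary is proprietary and covers over 80 categories.

```python
import re
from collections import Counter

# Hypothetical toy lexicon for illustration only; the real LIWC
# dictionary is proprietary and far larger.
TOY_LEXICON = {
    "female": {"she", "her", "woman", "women"},
    "male": {"he", "his", "man", "men"},
    "affect": {"lovely", "angry", "happy"},
}


def category_percentages(text, lexicon=TOY_LEXICON):
    """Return, per category, the percentage of words in `text` that
    appear in that category's word list."""
    # Simple lowercase tokenization on letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {category: 0.0 for category in lexicon}

    counts = Counter()
    for word in words:
        # A word may belong to more than one category.
        for category, vocabulary in lexicon.items():
            if word in vocabulary:
                counts[category] += 1

    return {category: 100.0 * counts[category] / total for category in lexicon}


if __name__ == "__main__":
    sample = "She is a lovely woman and he is angry"
    for category, pct in category_percentages(sample).items():
        print(f"{category}: {pct:.1f}%")
```

Comparing such per-category percentages across publications is one way the breadth of gendered language could be quantified, under the assumption that the category word lists are well validated.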