Exploring Artist Gender Bias in Music Recommendation

Dougal Shakespeare, Lorenzo Porcaro, Emilia Gómez, Carlos Castillo
DOI: https://doi.org/10.48550/arXiv.2009.01715
2020-10-06
Abstract: Music Recommender Systems (mRS) are designed to give personalised and meaningful recommendations of items (i.e. songs, playlists or artists) to a user base, thereby reflecting and further complementing individual users' specific music preferences. Whilst accuracy metrics have been widely applied to evaluate recommendations in mRS literature, evaluating a user's item utility from other impact-oriented perspectives, including their potential for discrimination, is still a novel evaluation practice in the music domain. In this work, we center our attention on a specific phenomenon for which we want to estimate if mRS may exacerbate its impact: gender bias. Our work presents an exploratory study, analyzing the extent to which commonly deployed state-of-the-art Collaborative Filtering (CF) algorithms may act to further increase or decrease artist gender bias. To assess group biases introduced by CF, we deploy a recently proposed metric of bias disparity on two listening event datasets: the LFM-1b dataset, and the earlier constructed Celma's dataset. Our work traces the causes of disparity to variations in input gender distributions and user-item preferences, highlighting the effect such configurations can have on users' gender bias after recommendation generation.
Information Retrieval
### What problem does this paper attempt to solve?

This paper explores whether music recommender systems (mRS) exacerbate gender bias towards artists. Specifically, the researchers ask whether widely used Collaborative Filtering (CF) algorithms increase or decrease users' preference biases towards artists of different genders when generating recommendations.

#### Main problems:

1. **The influence of gender bias in music recommendation**: Although accuracy metrics are widely used to evaluate recommender systems, evaluating them from other perspectives (such as social impact and fairness) is still an emerging research area. This paper focuses on gender bias, that is, on whether the recommender system amplifies existing gender inequality.
2. **Evaluation of existing datasets and algorithms**: The researchers use two publicly available music listening event datasets, LFM-1b and LFM-360k (the latter is the one the abstract calls Celma's dataset), and apply a recently proposed bias metric, "Bias Disparity", to evaluate the behavior of CF algorithms on them. Bias Disparity is defined as the relative difference between the preference ratio of the recommender system output and the preference ratio of the input.

#### Research objectives:

- **Evaluate the impact of Collaborative Filtering algorithms**: Verify experimentally whether commonly used CF algorithms exacerbate users' gender bias, thereby affecting the exposure and proportional representation of artists.
- **Explore the propagation mechanism of gender bias**: Analyze how the gender distribution and user preferences in the input data affect gender bias in the recommendation results.

### Method overview:

1. **Dataset selection**: Two publicly available music listening datasets, LFM-1b and LFM-360k, each containing a large number of users' listening records.
2. **Recommendation algorithm**: Several common CF algorithms are tested, including the neighborhood-based UserKNNAvg and the matrix-factorization-based NMF, with baseline algorithms (such as MostPopular and UserItemAvg) included for comparison.
3. **Evaluation metric**: In addition to traditional accuracy metrics (such as Precision and nDCG), Bias Disparity is used as the core metric to quantify the behavior of the recommender system with respect to gender bias.

### Experimental design:

- **Experiment 1**: Simulate a real-world scenario: generate recommendation lists for all users with identifiable genders, and evaluate how gender bias propagates under the actual data distribution.
- **Experiment 2**: Simulate an extreme-preference scenario: generate recommendations only for users who have extreme preferences for artists of a certain gender, and evaluate how gender bias changes in this situation.

### Conclusion:

With these methods, the researchers aim to reveal how recommender systems behave with respect to users' gender bias and to provide a basis for future improvements. In particular, they hope this research draws attention to the problem of gender bias in music recommender systems and promotes the design of fairer recommendation algorithms.
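
The Bias Disparity definition given under "Main problems" (the relative difference between the output and input preference ratios) can be illustrated with a minimal Python sketch. It assumes binary listening events and treats artist gender as the item category; the function names, user and artist IDs, and toy data are all hypothetical and are not taken from the paper.

```python
def preference_ratio(interactions, user_group, item_category, group, category):
    """Share of a user group's interactions that fall on items of one category.

    interactions: iterable of (user, item) pairs (assumed binary events).
    user_group / item_category: dicts mapping users and items to labels.
    """
    total = 0
    in_category = 0
    for user, item in interactions:
        if user_group.get(user) != group:
            continue  # only count interactions from the chosen user group
        total += 1
        if item_category.get(item) == category:
            in_category += 1
    if total == 0:
        raise ValueError("no interactions found for this user group")
    return in_category / total


def bias_disparity(inputs, recommendations, user_group, item_category, group, category):
    """Relative change in preference ratio from input to recommendations."""
    pr_in = preference_ratio(inputs, user_group, item_category, group, category)
    pr_out = preference_ratio(recommendations, user_group, item_category, group, category)
    if pr_in == 0:
        raise ValueError("input preference ratio is zero; disparity is undefined")
    return (pr_out - pr_in) / pr_in


# Hypothetical toy data: one user group "g1" (user u1) and a second group "g2".
user_group = {"u1": "g1", "u2": "g2"}
artist_gender = {
    "a1": "male", "a2": "male", "a3": "female", "a4": "female",
    "a5": "male", "a6": "male", "a7": "male", "a8": "female",
}
listening_events = [("u1", "a1"), ("u1", "a2"), ("u1", "a3"), ("u1", "a4"), ("u2", "a1")]
recommended = [("u1", "a5"), ("u1", "a6"), ("u1", "a7"), ("u1", "a8"), ("u2", "a3")]

bd_female = bias_disparity(listening_events, recommended, user_group, artist_gender, "g1", "female")
bd_male = bias_disparity(listening_events, recommended, user_group, artist_gender, "g1", "male")
print(bd_female)  # -0.5: female artists drop from 50% of input to 25% of output
print(bd_male)    # 0.5: male artists rise from 50% of input to 75% of output
```

Under this definition, a negative value means the recommendations contain proportionally fewer items of that category than the user group's listening history did, a positive value means proportionally more, and zero means the input proportion is preserved.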