Cold Start Similar Artists Ranking with Gravity-Inspired Graph Autoencoders

Guillaume Salha-Galvan, Romain Hennequin, Benjamin Chapus, Viet-Anh Tran, Michalis Vazirgiannis
DOI: https://doi.org/10.48550/arXiv.2108.01053
2021-08-03
Abstract: On an artist's profile page, music streaming services frequently recommend a ranked list of "similar artists" that fans also liked. However, implementing such a feature is challenging for new artists, for which usage data on the service (e.g. streams or likes) is not yet available. In this paper, we model this cold start similar artists ranking problem as a link prediction task in a directed and attributed graph, connecting artists to their top-k most similar neighbors and incorporating side musical information. Then, we leverage a graph autoencoder architecture to learn node embedding representations from this graph, and to automatically rank the top-k most similar neighbors of new artists using a gravity-inspired mechanism. We empirically show the flexibility and the effectiveness of our framework, by addressing a real-world cold start similar artists ranking problem on a global music streaming service. Along with this paper, we also publicly release our source code as well as the industrial graph data from our experiments.
Machine Learning, Information Retrieval, Social and Information Networks
### What problem does this paper attempt to address?

This paper addresses the "cold start similar artists ranking" problem in music streaming services. When a new artist first releases work on the platform, there is not yet enough user interaction data (such as streams or likes) for a usage-based recommendation system to generate a "similar artists" list for that artist. This makes new artists harder for users to discover and limits the coverage and fairness of the platform's recommendations.

#### Specific problem description

1. **Cold-start problem**: When a new artist releases work for the first time, the platform lacks the user interaction data needed to compute that artist's similarity to other artists, so no "similar artists" list can be generated.
2. **Limitations of the recommendation system**: Existing recommendation systems rely mainly on "hot" artists that already have a large amount of user interaction data, and they cannot handle newly added "cold" artists.
3. **Fairness problem**: Without sufficient user interaction data, many new or niche artists never enter the recommendation system. Results are therefore biased towards popular artists, which reduces the diversity and fairness of the recommendations.

#### Solution

The authors propose a method based on graph autoencoders that uses a gravity-inspired mechanism to predict the similar artists list of a new artist. The steps are as follows:

1. **Construct a directed, attributed graph**: Each artist is a node. Directed edges connect an artist to its top-k most similar neighbors and are weighted by the existing similarity scores. Side musical information is attached to the nodes as attributes.
2. **Graph autoencoder learning**: A graph autoencoder learns node embedding representations that capture both the graph structure and the node attributes.
3. **Gravity-inspired decoder**: A gravity-inspired mechanism predicts similarity scores between a new artist and the existing artists, and these scores are used to rank the new artist's top-k most similar neighbors.

With this method, a reasonable similar artists list can be generated for a new artist even when no user interaction data is available, which improves the coverage and fairness of the recommendation system.

#### Paper contributions

- Propose a gravity-inspired graph autoencoder framework for the cold start similar artists ranking problem.
- Evaluate the framework empirically on data from a global music streaming service, showing its effectiveness.
- Publicly release the source code and the industrial graph data from the experiments to support future research.
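The graph construction and gravity-inspired ranking steps described in the solution can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' released code. It assumes the decoder form used in the gravity-inspired graph autoencoder literature, where the score of a directed edge from artist i to artist j is sigmoid(m_j - λ · log ||z_i - z_j||²), with z an embedding vector and m_j a learned "mass" for the target node. The encoder is not shown: the embeddings, masses, artist names, and the helper names `build_top_k_graph`, `gravity_score`, and `rank_similar_artists` are all hypothetical.

```python
import math


def build_top_k_graph(similarity, k):
    """Keep, for each artist, directed edges to its k most similar artists.

    `similarity` maps artist -> {other artist: score}. The result has the
    same shape, restricted to the k highest-scoring neighbors.
    """
    graph = {}
    for artist, scores in similarity.items():
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        graph[artist] = dict(ranked[:k])
    return graph


def gravity_score(z_i, z_j, mass_j, lam=1.0, eps=1e-12):
    """Score of the directed edge i -> j (assumed decoder form).

    The score grows with the target's mass and shrinks with the squared
    distance between the two embeddings, so it is asymmetric in i and j.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(z_i, z_j))
    logit = mass_j - lam * math.log(sq_dist + eps)  # eps avoids log(0)
    return 1.0 / (1.0 + math.exp(-logit))


def rank_similar_artists(new_embedding, catalog, k, lam=1.0):
    """Return the k catalog artists with the highest score from the new artist."""
    scored = [
        (name, gravity_score(new_embedding, z, mass, lam))
        for name, (z, mass) in catalog.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in scored[:k]]


# Toy catalog: artist -> (embedding, mass). All values are made up.
catalog = {
    "artist_a": ([0.0, 0.0], 0.0),
    "artist_b": ([1.0, 0.0], 0.0),
    "artist_c": ([3.0, 4.0], 0.0),
}
# Embedding of a cold artist, assumed to come from an encoder fed only
# with the artist's side information.
new_artist = [0.9, 0.1]
top_2 = rank_similar_artists(new_artist, catalog, k=2)
```

Because the mass term belongs to the target node only, the score for i → j generally differs from the score for j → i, which is what lets a symmetric distance between embeddings produce directed rankings.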