How LinkedIn Economic Graph Bonds Information and Product: Applications in LinkedIn Salary

Xi Chen,Yiqun Liu,Liang Zhang,Krishnaram Kenthapadi
DOI: https://doi.org/10.48550/arXiv.1806.09063
2018-06-24
Social and Information Networks
Abstract:The LinkedIn Salary product was launched in late 2016 with the goal of providing insights on compensation distribution to job seekers, so that they can make more informed decisions when discovering and assessing career opportunities. The compensation insights are provided based on data collected from LinkedIn members and aggregated in a privacy-preserving manner. Given the simultaneous desire for computing robust, reliable insights and for having insights to satisfy as many job seekers as possible, a key challenge is to reliably infer the insights at the company level when there is limited or no data at all. We propose a two-step framework that utilizes a novel, semantic representation of companies (Company2vec) and a Bayesian statistical model to address this problem. Our approach makes use of the rich information present in the LinkedIn Economic Graph, and in particular, uses the intuition that two companies are likely to be similar if employees are very likely to transition from one company to the other and vice versa. We compute embeddings for companies by analyzing the LinkedIn members' company transition data using machine learning algorithms, then compute pairwise similarities between companies based on these embeddings, and finally incorporate company similarities in the form of peer company groups as part of the proposed Bayesian statistical model to predict insights at the company level. We perform extensive validation using several different evaluation techniques, and show that we can significantly increase the coverage of insights while, in fact, even improving the quality of the obtained insights. For example, we were able to compute salary insights for 35 times as many title-region-company combinations in the U.S. as compared to previous work, corresponding to 4.9 times as many monthly active users. Finally, we highlight the lessons learned from deployment of our system.
What problem does this paper attempt to address?