Transfer Learning for Cross-Platform Software Crowdsourcing Recommendation.

Shuhan Yan, Beijun Shen, Wenkai Mo, Ning Li
DOI: https://doi.org/10.1109/apsec.2017.33
2017-01-01
Abstract: With the growth of the software crowdsourcing industry, an increasing number of users have joined software crowdsourcing platforms to publish software project tasks or to seek suitable work opportunities. One competitive feature of these platforms is recommending suitable projects to developers. However, such recommender systems suffer from a serious platform cold-start problem, especially on new software crowdsourcing platforms, which usually have too little accumulated data to support accurate model training and prediction. This paper addresses the platform cold-start problem in software crowdsourcing recommendation systems using transfer learning. We propose a novel cross-platform recommendation method for new software crowdsourcing platforms. Its core idea is to transfer data and knowledge from mature software crowdsourcing platforms (source domains) to compensate for insufficient training of the recommendation model on a new platform (target domain). The method transforms and combines the different kinds of features in the source and target domains, then maps them to a latent space by learning the correspondences between features. Specifically, our method is an instance of content-based recommendation: it uses tags and keywords extracted from project descriptions on crowdsourcing platforms as features and assigns each feature a weight that reflects its importance. We then propose Weight-SCL, which merges tag features and keyword features while keeping them distinguishable, before performing feature mapping and data migration to achieve knowledge transfer. Finally, using data from two well-known software crowdsourcing platforms, we conduct a series of experiments to evaluate the multi-source recommendation system against baseline methods and obtain a 1.2x performance improvement.
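The content-based step described in the abstract (tags and keywords as weighted features) can be illustrated with a minimal sketch. This is not the paper's Weight-SCL method and does not perform any cross-platform feature mapping; it only shows one plausible way to merge tag and keyword features while keeping them distinguishable, then rank projects by cosine similarity. The function names, the `tag:`/`kw:` prefixes, and the weight values are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical weights (not from the paper): tags are assumed to be
# more informative than keywords extracted from free-text descriptions.
TAG_WEIGHT = 2.0
KEYWORD_WEIGHT = 1.0


def weighted_features(tags, keywords,
                      tag_weight=TAG_WEIGHT, keyword_weight=KEYWORD_WEIGHT):
    """Merge tags and keywords into one sparse weighted vector.

    Prefixes keep the two feature kinds distinguishable after merging.
    """
    vec = Counter()
    for tag in tags:
        vec["tag:" + tag.lower()] += tag_weight
    for keyword in keywords:
        vec["kw:" + keyword.lower()] += keyword_weight
    return dict(vec)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(developer_profile, projects, top_n=3):
    """Return up to top_n project names with nonzero similarity, best first."""
    scored = [(cosine(developer_profile, feats), name)
              for name, feats in projects.items()]
    # Sort by descending score, breaking ties alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_n] if score > 0.0]


# Toy data, invented for this example.
projects = {
    "android-shop-app": weighted_features(["android", "java"],
                                          ["shopping", "payment"]),
    "web-dashboard": weighted_features(["javascript", "react"],
                                       ["dashboard", "charts"]),
}
profile = weighted_features(["java", "android"], ["payment"])
print(recommend(profile, projects))
```

In a cold-start setting a new platform would have too few such profiles and projects for this kind of matching to work well, which is the gap the abstract says the cross-platform transfer step is meant to fill.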