Learning Sparse Functional Factors for Large-Scale Service Clustering.

Qi Yu, Hongbing Wang, Liang Chen
DOI: https://doi.org/10.1109/icws.2015.36
2015-01-01
Abstract: The past decade has witnessed rapid growth of web-based services, making the discovery of user-desired services from a large and diverse service space a fundamental challenge. Service clustering has been demonstrated as a promising solution: it automatically detects functionally similar services so that they can be searched and discovered together, which improves both the efficiency and the accuracy of service discovery. However, the autonomous nature of service providers leads to highly diverse usage of terms in their respective service descriptions. Furthermore, a typical service description contains very few terms because a service offers a small number of focused functionalities. These characteristics make service descriptions different from regular text documents and pose additional challenges when clustering large-scale services. Recent work shows that service clustering can benefit from discovering functionality-related latent factors and using them to represent services, as opposed to a large and diverse set of terms. Nonetheless, how to determine the total number of latent functional factors and sparsely assign them to each service description remains a central challenge, especially for a large service space where there is no easy way to enumerate the different types of functionality. In this paper, we propose a machine learning method that automatically learns the number of latent functional factors in a service space. It also enforces a sparsity constraint, which allows each service to be represented by a small number of latent functional factors. The sparsity constraint is in line with the fact that most real-world services provide only limited functionalities. We conduct extensive experiments on two sets of real-world service data to demonstrate the effectiveness of the proposed service clustering approach.