Abstract: Due to the rapid growth of web services in repositories, discovering the required web service is becoming an increasingly cumbersome task, which has raised the demand for efficient web service clustering algorithms. When related web services are stored in clusters within a service repository, the discovery process improves because both the search space and the search time are reduced. Many researchers in this field have used the Term Frequency-Inverse Document Frequency (TF-IDF) method to represent web services in vector space. The TF-IDF approach has several general limitations: 1) it is not efficient for large documents; 2) it ignores the position of a term and its co-occurrences; and 3) it cannot analyze how terms are dispersed across different documents. In the web service scenario, services are represented as short texts. TF-IDF does not work well for web service representation because it cannot effectively determine the importance of a term with respect to its occurrence in other documents. If we compare two service documents, 's1' with a large number of terms and 's2' with a small number of terms, TF-IDF does not show that the terms in 's1' are less important than those in 's2'. Therefore, it is not possible to assign effective weights to the terms. Without an effective vector space representation, the performance of the clustering algorithm also degrades. In this paper, we propose a new approach, LFW+K, which uses Length Feature Weight (LFW) for the vectorized representation of services, followed by K-Means clustering. The proposed approach identifies the informative terms in a web service and assigns term weights accordingly by considering parameters such as the dimension of the web service document, the maximum frequency of a term in the document, and the occurrences of a term in other documents.
LFW+K is applied to datasets of real-world web services, and its performance is measured using standard criteria (precision, recall, F1-score, and accuracy). The results are compared with those of K-Means clustering on the TF-IDF representation (TF-IDF+K). They show that the proposed method outperforms clustering based on the TF-IDF vector space representation of web services.
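
To make the TF-IDF+K baseline concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation. It assumes whitespace tokenization, tf = count / document length, idf = log(N / df), squared Euclidean distance, and fixed initial centroids. The service descriptions and the names `tfidf_vectors`, `kmeans`, and `init_indices` are hypothetical. The LFW weighting itself is not sketched because the abstract does not give its formula.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with tf = count/len(doc) and idf = log(N/df).

    Assumption: plain whitespace tokenization and unsmoothed idf.
    """
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty descriptions
        vectors.append(
            [(counts[t] / length) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_indices, iterations=20):
    """Plain K-Means; centroids start at the vectors named by init_indices.

    Assumption: fixed initial centroids keep the sketch deterministic.
    """
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical short service descriptions (not taken from the paper's datasets).
services = [
    "get weather forecast city",
    "weather forecast by zip code",
    "send sms message phone",
    "send email message address",
]
vocab, vectors = tfidf_vectors(services)
labels = kmeans(vectors, init_indices=(0, 2))
print(labels)
```

On this toy input the two weather services fall into one cluster and the two messaging services into the other. On real repositories the result would depend on preprocessing, the choice of K, and the centroid initialization, none of which the abstract specifies.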

Enhancing web service clustering using Length Feature Weight Method for service description document vector space representation
