Finding Best Possible Number of Clusters using K-Means Algorithm

DOI: https://doi.org/10.35940/ijeat.a1119.1291s419
2019-12-30
International Journal of Engineering and Advanced Technology
Abstract: Customers are assets for a business, and companies are investing more in customer relationship management. Retaining customers over the long term is difficult in today's market. Online shopping is also growing day by day: people prefer to visit popular websites and spend very little time choosing products. Online shops are therefore paying more attention to analyzing customer preferences, needs, and shopping behavior through data mining techniques. Proper classification is necessary to organize such data. In this work, customers with similar buying behavior are grouped based on two features, age and salary. The K-Means algorithm is applied to both the original data and the normalized data to form clusters for different values of K. The within-cluster sum of squares (WSS) is calculated for both data sets for each number of clusters. A lower WSS is considered better, and it is achieved with the normalized data. Cluster validity is evaluated with the elbow, silhouette, and gap statistic methods to choose the optimal number of clusters. The work is implemented in R.
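The WSS comparison described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the authors' R, it is not their implementation, and it covers only the WSS (elbow) computation, not the silhouette or gap statistic. The customer records, the function names (`min_max_normalize`, `kmeans_wss`), and the choice of min-max scaling are all assumptions made for illustration; the abstract does not say which normalization was used. Note that WSS depends on the scale of the features, so the raw and normalized values are not directly comparable in magnitude.

```python
import random

# Hypothetical (age, salary) records, invented for illustration only.
customers = [
    (22, 21000), (25, 24000), (27, 23000),
    (35, 52000), (38, 55000), (40, 50000),
    (52, 90000), (55, 95000), (58, 88000),
]

def min_max_normalize(points):
    """Rescale each feature (column) to the [0, 1] range."""
    cols = list(zip(*points))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_wss(points, k, n_init=10, max_iter=100, seed=0):
    """Run Lloyd's K-Means n_init times and return the lowest WSS found."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_init):
        centers = rng.sample(points, k)  # k distinct points as initial centers
        for _ in range(max_iter):
            # Assignment step: attach each point to its nearest center.
            clusters = [[] for _ in range(k)]
            for p in points:
                j = min(range(k), key=lambda i: sq_dist(p, centers[i]))
                clusters[j].append(p)
            # Update step: move each center to the mean of its members.
            new_centers = [
                tuple(sum(c) / len(members) for c in zip(*members))
                if members else centers[i]
                for i, members in enumerate(clusters)
            ]
            if new_centers == centers:
                break
            centers = new_centers
        wss = sum(min(sq_dist(p, c) for c in centers) for p in points)
        best = min(best, wss)
    return best

normalized = min_max_normalize(customers)
for k in range(1, 6):
    print(k, kmeans_wss(customers, k), kmeans_wss(normalized, k))
```

Plotting WSS against K and looking for the point where the decrease levels off gives the elbow estimate of the number of clusters. In the raw data the salary column dominates the distances, which is one reason to scale the features before clustering.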