Abstract:
Purpose: This paper proposes a fourfold semantic similarity measure that achieves higher accuracy than approaches in the existing literature. A framework that applies the fourfold semantic similarity facilitates change detection in the URL and recommendation of source documents. New technology trends emerge with the continuous growth of resources on the collaborative web. This interactive and collaborative web presents big challenges for recent technologies such as cloud and big data.
Design/methodology/approach: The enormously growing volume of resources must be accessed more efficiently, which requires clustering and classification techniques. The resources on the web are described in a more meaningful manner.
Findings: Resources can be described in the form of metadata expressed in the Resource Description Framework (RDF). A fourfold similarity is proposed, in contrast to the three-fold similarity proposed in the existing literature. The fourfold similarity includes semantic annotation based on named entity recognition in the user interface; domain-based concept matching, with an improved score-based classification of that matching based on ontology; a sequence-based word sensing algorithm; and RDF-based updating of triples. The framework aggregates all these similarity measures and comprises a semantic user interface, semantic clustering, sequence-based classification and a semantic recommendation system with RDF updating for change detection.
Research limitations/implications: Existing work suggests that linking resources semantically improves retrieval and search. Previous literature shows that keywords can be used to retrieve linked information from an article and to determine the similarity between documents using semantic analysis.
Practical implications: These traditional systems also lack scalability and efficiency. The proposed study designs a model that pulls and prioritizes knowledge-based content from the Hadoop distributed framework. It also proposes a Hadoop-based pruning system and recommendation system.
Social implications: The pruning system raises an alert about dynamic changes in the article (virtual document). Changes in the document are automatically updated in the RDF document. This helps in semantic matching and retrieval of the source most relevant to the virtual document.
Originality/value: Recommendation and change detection in blogs are performed semantically using N-Triples and automated data structures. The user-focussed, choice-based crawling proposed in this system also assists collaborative filtering. In turn, collaborative filtering recommends user-focussed source documents. The entire clustering and retrieval system is deployed on multi-node Hadoop in the Amazon AWS environment, and graphs are plotted and analyzed.
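The change-detection step described above (detect changes in a virtual document, then update the RDF document) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each document snapshot is already available as a set of subject-predicate-object triples, and the names `Triple`, `diff_triples`, `apply_update` and the `ex:`/`dc:` example values are all hypothetical.

```python
from typing import NamedTuple, Set, Tuple


class Triple(NamedTuple):
    """One RDF statement (hypothetical in-memory form, not a real RDF library type)."""
    subject: str
    predicate: str
    obj: str


def diff_triples(old: Set[Triple], new: Set[Triple]) -> Tuple[Set[Triple], Set[Triple]]:
    """Return (added, removed) triples between two snapshots of a document."""
    return new - old, old - new


def apply_update(store: Set[Triple], added: Set[Triple], removed: Set[Triple]) -> Set[Triple]:
    """Return a new triple store with the detected changes applied."""
    return (store - removed) | added


# Hypothetical snapshots of the same blog post before and after an edit.
old = {
    Triple("ex:post1", "dc:title", '"Hadoop basics"'),
    Triple("ex:post1", "dc:creator", '"A. Author"'),
}
new = {
    Triple("ex:post1", "dc:title", '"Hadoop basics, revised"'),
    Triple("ex:post1", "dc:creator", '"A. Author"'),
}

added, removed = diff_triples(old, new)
changed = bool(added or removed)  # an alert would be raised when this is True
store = apply_update(old, added, removed)
```

In this sketch a change alert corresponds to `changed` being true, and after the update `store` equals the new snapshot. A distributed deployment such as the Hadoop setup the abstract mentions would need to partition this comparison across nodes, which is outside the scope of the example.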

Ameliorating Search Results Recommendation System Based on K-Means Clustering Algorithm and Distance Measurements

A Hybrid Recommendation Algorithm Based on Clustering and Collaborative Filtering

A new collaborative filtering recommendation algorithm based on dimensionality reduction and clustering techniques

Application of k-means clustering algorithm to improve effectiveness of the results recommended by journal recommender system

Application of Improved K-Means Algorithm in Collaborative Recommendation System

Clustering Web Search Results For Effective Arabic Language Browsing

Improving the methord of collaborative filtering by integrating semantic and temporal factors and the methord of cluster analysis.

A Hybrid News Recommendation Algorithm Based On K-means Clustering and Collaborative Filtering

E-commerce User Recommendation Algorithm Based on Social Relationship Characteristics and Improved K-Means Algorithm

Improving search result clustering using nature inspired approach

Efficient clustering in data mining applications based on harmony search and k-medoids

Information Retrieval in long documents: Word clustering approach for improving Semantics

Semantic tracking and recommendation using fourfold similarity measure from large scale data using hadoop distributed framework in cloud

Building a recommendation system based on the job offers extracted from the web and the skills of job seekers

An Effective Clustering-Based Web Page Recommendation Framework for E-Commerce Websites

E-commerce recommender system based on improved K-means commodity information management model

Scalable Collaborative Filtering Based on Splitting-Merging Clustering Algorithm

Enhancing Recommender System performance through the fusion of Fuzzy C-Means, Restricted Boltzmann Machine, and Extreme Learning Machine

Tag-Aware Document Representation for Research Paper Recommendation

A Distributed Collaborative Filtering Algorithm Using Multiple Data Sources

State of the art document clustering algorithms based on semantic similarity