Abstract: Nowadays, the volume of text documents on the internet, in e-mail, and on web pages is growing rapidly, and these documents are stored in electronic databases. This makes them difficult to organize and browse. To overcome this problem, document preprocessing, term selection, attribute reduction, and maintaining the relationships between important terms using background knowledge (WordNet) become important steps in data mining. In this paper the approach is organized in stages. First, the documents are preprocessed: stop words are removed, stemming is performed with the Porter stemmer algorithm, the WordNet thesaurus is applied to maintain the relationships between important terms, and global unique words and frequent word sets are generated. Second, a data matrix is formed. Third, terms are extracted from the documents using the term selection approaches tf-idf, tf-df, and tf2, based on their minimum threshold values. The terms of every document are preprocessed, and the frequency of each term within the document is counted for representation. The purpose of this approach is to reduce the number of attributes and to find the most effective term selection method using WordNet for better clustering accuracy. Experiments are run on Reuters Transcription Subsets, wheat, trade, money grain, and ship, Reuters 21578, Classic 30, 20 News group (atheism), 20 News group (Hardware), and 20 News group (Computer Graphics), among others.
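
The stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the stop-word list is a tiny hypothetical one, `crude_stem` is a simplified stand-in for the Porter stemmer, the WordNet step is omitted, only tf-idf is shown (in its common form, tf × log(N/df)) because the abstract does not define tf-df or tf2, and the threshold of 1.0 is arbitrary. All function names are illustrative.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list; a real run would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "are", "to", "in", "on", "for"}


def crude_stem(word):
    """Stand-in for the Porter stemmer: strips a few common suffixes only."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Stage 1: lowercase, tokenize, drop stop words, stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def build_matrix(docs):
    """Stage 2: document-term matrix of raw term frequencies."""
    token_lists = [preprocess(d) for d in docs]
    vocab = sorted({t for tokens in token_lists for t in tokens})
    counts = [Counter(tokens) for tokens in token_lists]
    matrix = [[c[t] for t in vocab] for c in counts]
    return vocab, matrix


def select_terms(vocab, matrix, threshold):
    """Stage 3: keep terms whose best tf-idf score reaches the threshold."""
    n_docs = len(matrix)
    selected = []
    for j, term in enumerate(vocab):
        column = [row[j] for row in matrix]
        df = sum(1 for tf in column if tf > 0)  # document frequency
        idf = math.log(n_docs / df)
        if max(tf * idf for tf in column) >= threshold:
            selected.append(term)
    return selected


docs = [
    "Wheat prices are rising in the grain markets",
    "The ship carried wheat and grain",
    "Trade talks on shipping tariffs",
]
vocab, matrix = build_matrix(docs)
terms = select_terms(vocab, matrix, threshold=1.0)  # threshold is an assumption
print(terms)
```

In this toy corpus, terms that occur in two of the three documents fall below the threshold and are dropped, which is the attribute reduction the abstract refers to. A fuller version would swap in a real Porter stemmer and a WordNet lookup before building the matrix.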

Word Distributed Representation Based Text Clustering

Co-Clustering With Manifold And Double Sparse Representation

Shortest Path And Word Vector Based Relation Representation And Clustering

A Semantic approach for effective document clustering using WordNet

Information Retrieval in long documents: Word clustering approach for improving Semantics

A Novel Text Clustering Algorithm Based on Inner Product Space Model of Semantic

CDW: A Text Clustering Model for Diverse Versions Discovery

Topic Modeling Using Distributed Word Embeddings

Clustering Text Data Streams

Web Service Clustering Method Based on Word Vector and Biterm Topic Model

Distributed Data Stream Clustering: A Fast EM-based Approach

Document Representation with Statistical Word Senses in Cross-Lingual Document Clustering

Clustering web images by correlation mining of image-text

Distributional Character Clustering For Chinese Text Categorization

Representing Document As Dependency Graph for Document Clustering

Document Clustering Based on Word Sense Cluster

DRWS: A Model for Learning Distributed Representations for Words and Sentences

A Clustering Algorithm for Short Documents Based On Concept Similarity

Grouped Text Clustering Using Non-Parametric Gaussian Mixture Experts

Concept chain based text clustering

A Web Service Clustering Method Based on Semantic Similarity and Multidimensional Scaling Analysis