Abstract: Machine learning for text classification underpins document cataloging, news filtering, document steering and exemplification. In text mining, effective feature selection is essential to making the learning task more accurate and efficient. The traditional lazy text classifier k-Nearest Neighborhood (kNN) has a major drawback: it computes the similarity between every object in the training set and every object in the testing set, which inflates the computational complexity of the algorithm and consumes a large amount of main memory. To reduce these shortcomings from the viewpoint of a data-mining practitioner, this paper proposes an amalgamated technique that combines a novel restructured version of kNN, called Augmented kNN (AkNN), with k-Medoids (kMdd) clustering. The proposed work preprocesses the initial training set by applying attribute feature selection to reduce its high dimensionality; it also detects and excludes outlier (high-flier) samples in the initial training set and restructures it into a constricted training set. The kMdd clustering algorithm generates the cluster centers (as interior objects) for each category and restructures the constricted training set with these centroids. This technique is amalgamated with the AkNN classifier, which is equipped with text mining similarity measures. Finally, significant weights and ranks are assigned to each object in the new training set based on its affinity to the object in the testing set. Experiments conducted on Reuters-21578, a UCI benchmark text mining data set, and comparisons with the traditional kNN classifier indicate that the proposed method yields superior performance in both clustering and classification.
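The general idea of condensing a training set with per-category medoids before running a similarity-weighted kNN can be illustrated with a minimal sketch. This is not the authors' AkNN: the abstract does not give its weighting or ranking rules, so the sketch assumes cosine similarity, a plain alternating k-medoids, and votes weighted by similarity. The feature selection and outlier removal steps are omitted, and all function names and the toy term-count matrix are hypothetical.

```python
import numpy as np

def cosine_sim(A, B):
    """Pairwise cosine similarity between the rows of A and the rows of B."""
    A = A / (np.linalg.norm(A, axis=1, keepdims=True) + 1e-12)
    B = B / (np.linalg.norm(B, axis=1, keepdims=True) + 1e-12)
    return A @ B.T

def k_medoids(X, k, n_iter=20, seed=0):
    """Plain alternating k-medoids on cosine distance; returns medoid row indices."""
    rng = np.random.default_rng(seed)
    D = 1.0 - cosine_sim(X, X)
    medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        new = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size:
                # The member with the smallest total distance to its cluster becomes the medoid.
                within = D[np.ix_(members, members)].sum(axis=1)
                new[c] = members[np.argmin(within)]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return medoids

def condense(X, y, k_per_class):
    """Keep at most k_per_class medoid documents for each category."""
    keep = []
    for label in np.unique(y):
        idx = np.flatnonzero(y == label)
        k = min(k_per_class, idx.size)
        keep.extend(idx[k_medoids(X[idx], k)])
    keep = np.array(keep)
    return X[keep], y[keep]

def weighted_knn_predict(X_train, y_train, X_test, k=3):
    """kNN in which each neighbour votes with its cosine similarity (an assumed weighting)."""
    sims = cosine_sim(X_test, X_train)
    preds = []
    for row in sims:
        top = np.argsort(row)[::-1][:k]
        votes = {}
        for i in top:
            votes[y_train[i]] = votes.get(y_train[i], 0.0) + row[i]
        preds.append(max(votes, key=votes.get))
    return np.array(preds)

# Toy term-count matrix: rows are documents, columns are terms (made-up data).
X = np.array([[3, 1, 0, 0], [2, 2, 0, 0], [4, 0, 1, 0],
              [0, 0, 3, 1], [0, 1, 2, 2], [0, 0, 1, 4]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
X_small, y_small = condense(X, y, k_per_class=2)
X_test = np.array([[5, 1, 0, 0], [0, 0, 2, 3]], dtype=float)
print(weighted_knn_predict(X_small, y_small, X_test, k=3))
```

Because each test document is compared only against the retained medoids, the number of similarity computations per query drops from the full training set size to roughly the number of categories times `k_per_class`, which is the cost reduction the abstract is after.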

TCUAP: A Novel Approach of Text Clustering Using Asymmetric Proximity

Text Clustering Based on Asymmetric Similarity

Semantic Correlation Network Based Text Clustering

DIAS: A Disassemble-Assemble Framework for Highly Sparse Text Clustering

Improved ROCK for text clustering using asymmetric proximity

Co-Clustering With Manifold And Double Sparse Representation

Concept chain based text clustering

Quantum clustering — A novel method for text analysis

A Novel Self-Adaptive Affinity Propagation Clustering Algorithm Based on Density Peak Theory and Weighted Similarity

A Lda-Based Algorithm For Length-Aware Text Clustering

A Linguistic Feature Based Text Clustering Method

A Clustering Model for Three-Way Asymmetric Proximities: Unveiling Origins and Destinations

A Novel Text Clustering Algorithm Based on Inner Product Space Model of Semantic

Subspace Clustering of Very Sparse High-Dimensional Data

Constrained Coclustering for Textual Documents

An Efficient Clustering Algorithm for Small Text Documents

Affinity propagation clustering on oral conversation texts

Novel text categorization by amalgamation of augmented k-nearest neighborhood classification and k-medoids clustering

A Clustering Algorithm for Short Documents Based On Concept Similarity

Clustering Text Data Streams

Research on a Text Data Preprocessing Method Suitable for Clustering Algorithm