Abstract: Machine learning for text classification underpins document cataloging, news filtering, document steering, and exemplification. In text mining, effective feature selection is important for making the learning task more accurate and efficient. The traditional lazy text classifier k-Nearest Neighborhood (kNN) has a major pitfall: it calculates the similarity between all objects in the training and testing sets, which inflates both the computational complexity of the algorithm and its consumption of main memory. To diminish these shortcomings from the viewpoint of a data-mining practitioner, this paper proposes an amalgamative technique that combines a novel restructured version of kNN, called Augmented kNN (AkNN), with k-Medoids (kMdd) clustering. The proposed work preprocesses the initial training set by applying attribute feature selection to reduce its high dimensionality; it also detects and excludes outlier samples in the initial training set and restructures it into a constricted training set. The kMdd clustering algorithm generates the cluster centers (as interior objects) for each category and restructures the constricted training set with these centroids. This technique is amalgamated with the AkNN classifier, which is configured with text mining similarity measures. Finally, significant weights and ranks are assigned to each object in the new training set based on its affinity to the object in the testing set. Experiments conducted on Reuters-21578, a UCI benchmark text mining data set, and comparisons with the traditional kNN classifier indicate that the proposed method yields superior performance in both clustering and classification.
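The core idea in the abstract, shrinking each category to a few k-medoids cluster centers and then classifying with a similarity-weighted kNN vote, can be illustrated with a minimal sketch. This is not the authors' AkNN implementation: the function names (`cosine_sim`, `k_medoids`, `condense_training_set`, `weighted_knn_predict`), the choice of cosine similarity, and the simple alternating medoid update are all assumptions made for illustration, and the feature selection and outlier removal steps are omitted.

```python
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a_n = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), 1e-12)
    b_n = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), 1e-12)
    return a_n @ b_n.T

def k_medoids(X, k, n_iter=20, seed=0):
    """Alternating k-medoids on cosine distance; returns medoid row indices."""
    rng = np.random.default_rng(seed)
    dist = 1.0 - cosine_sim(X, X)
    medoids = rng.choice(len(X), size=min(k, len(X)), replace=False)
    for _ in range(n_iter):
        # Assign every document to its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new = medoids.copy()
        for j in range(len(medoids)):
            members = np.flatnonzero(labels == j)
            if members.size:
                # New medoid: the member with the smallest total distance
                # to the other members of its cluster.
                within = dist[np.ix_(members, members)].sum(axis=1)
                new[j] = members[np.argmin(within)]
        if np.array_equal(new, medoids):
            break
        medoids = new
    return medoids

def condense_training_set(X, y, k_per_class):
    """Keep only the medoids of each category as the reduced training set."""
    keep_X, keep_y = [], []
    for c in np.unique(y):
        Xc = X[y == c]
        idx = k_medoids(Xc, k_per_class)
        keep_X.append(Xc[idx])
        keep_y.extend([c] * len(idx))
    return np.vstack(keep_X), np.array(keep_y)

def weighted_knn_predict(X_train, y_train, X_test, k=3):
    """kNN where each neighbor votes with its cosine similarity as weight."""
    sims = cosine_sim(X_test, X_train)
    preds = []
    for row in sims:
        top = np.argsort(row)[::-1][:k]  # indices of the k most similar docs
        votes = {}
        for i in top:
            votes[y_train[i]] = votes.get(y_train[i], 0.0) + row[i]
        preds.append(max(votes, key=votes.get))
    return np.array(preds)

# Toy term-count vectors for two categories (made-up data).
X = np.array([[3, 0, 1, 0], [2, 0, 0, 0], [4, 1, 0, 0],
              [0, 3, 0, 2], [0, 2, 1, 3], [0, 4, 0, 1]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

X_small, y_small = condense_training_set(X, y, k_per_class=1)
X_test = np.array([[5, 0, 0, 0], [0, 1, 0, 1]], dtype=float)
preds = weighted_knn_predict(X_small, y_small, X_test, k=1)
```

In this toy run the six training documents are reduced to two medoids, so each test document is compared against two vectors instead of six; this is the kind of saving in similarity computations and memory that the abstract is aiming for.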

Text Categorization Via Attribute Distance Weighted K-Nearest Neighbor Classification.

A refined weighted K-Nearest Neighbors algorithm for text categorization

Accelerated K-Nearest Neighbors Algorithm Based on Principal Component Analysis for Text Categorization

An adaptive k-nearest neighbor text categorization strategy

Novel text categorization by amalgamation of augmented k-nearest neighborhood classification and k-medoids clustering

Algorithm 1: Attribute Selection Algorithm Based on Mutual Dependency

An Improved K-Nearest Neighbor Algorithm for Text Categorization

An Improved KNN Text Categorization Method Based on Spanning Tree Documents Clustering

Optimized K-Nn Text Categorization Approach

A kNN Text Categorization Algorithm Base on χ² Statistic

An adaptive fuzzy kNN text classifier

STUDY ON THE APPLICATION OF FUZZY kNN TO TEXT CATEGORIZATION

The Weighted KNN Text Categorization Algorithm Based on Training Set Cutting

Text Categorization via Similarity Search: An Efficient and Effective Novel Algorithm

A Non-VSM kNN algorithm for text classification

An Efficient Algorithm for Large-Scale Text Categorization

Centroid Training to Achieve Effective Text Classification

Efficient KNN Text Categorization Based on Multiedit and Condensing Techniques

An Efficient Text Categorization Algorithm Based on Category Memberships

Building a Simple and Effective Text Categorization System using Relative Importance in Category

Improved KNN Text Categorization