An Algorithm for the Removal of Redundant Dimensions to Find Clusters in N-Dimensional Data Using Subspace Clustering

Muhammad Shahbaz, Syed Muhammad Ahsen, Ishtiaq Hussain, Muhammad Ashraf Shaheen, Syed Athar Masood
2011-01-01
Abstract: Data mining has emerged as a powerful tool for extracting knowledge from huge databases. Researchers have introduced several machine learning algorithms that explore databases to discover information, hidden patterns, and rules that were not known when the data were recorded. Owing to remarkable developments in storage capacity, processing power, and algorithmic tools, practitioners are developing new and improved algorithms and techniques in several areas of data mining to discover the rules and relationships among attributes in simple and complex higher-dimensional databases. Furthermore, data mining is applied in a large variety of areas, ranging from banking to marketing, from engineering to bioinformatics, and from investment to risk analysis and fraud detection. Practitioners are analyzing and implementing artificial neural network techniques for classification and regression problems because of their accuracy and efficiency. The aim of this short research project is to develop a way of identifying clusters in high-dimensional data, as well as the redundant dimensions that can create noise when identifying those clusters. The techniques used in this project utilize the strength of the projections of the data points along the dimensions to identify the intensity of projection along each dimension, in order to find clusters and redundant dimensions in high-dimensional data. (Dr. Muhammad Shahbaz, Dr. Syed Ahsan, Ishtiaq Hussain, Muhammad Shaheen, Syed Athar Masood. An Algorithm for the Removal of Redundant Dimensions to Find Clusters in N-Dimensional Data Using Subspace Clustering. Journal of American Science 2011;7(6):956-964). (ISSN: 1545-1003). http://www.americanscience.org.
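The abstract does not give the details of the algorithm, so the following is only a minimal sketch of the general idea of scoring each dimension by the intensity of the data's projection onto it. It assumes the data are scaled to [0, 1], measures intensity as the share of points falling in the fullest histogram bin, and treats a dimension whose projection is close to uniform as redundant. The function names, the bin count, and the `factor` threshold are illustrative assumptions, not the authors' method.

```python
import numpy as np


def projection_intensity(data, bins=10):
    """Return, for each dimension, the share of points in its fullest bin.

    Assumes `data` is an (n_points, n_dims) array scaled to [0, 1].
    """
    n_points, n_dims = data.shape
    scores = np.empty(n_dims)
    for d in range(n_dims):
        # Project every point onto dimension d and histogram the projection.
        counts, _ = np.histogram(data[:, d], bins=bins, range=(0.0, 1.0))
        scores[d] = counts.max() / n_points
    return scores


def split_dimensions(data, bins=10, factor=2.0):
    """Split dimensions into (relevant, redundant) index lists.

    A uniform projection puts about 1/bins of the points in each bin, so a
    dimension counts as relevant only if its peak share is at least
    `factor` times that baseline (an assumed, hypothetical threshold).
    """
    scores = projection_intensity(data, bins)
    threshold = factor / bins
    relevant = [d for d in range(data.shape[1]) if scores[d] >= threshold]
    redundant = [d for d in range(data.shape[1]) if scores[d] < threshold]
    return relevant, redundant
```

Under these assumptions, a dimension along which the points concentrate around one value is kept, while a dimension along which they spread evenly is flagged as redundant. Clusters could then be sought in the subspace spanned by the relevant dimensions only.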