Abstract: Clustering is the process of grouping a set of objects into classes of similar objects. Until recently, the concept of similarity has been based on distances, e.g., Euclidean distance and cosine distance. Our previous work on δ-cluster and δ-pCluster designed new similarity models to capture the subspace coherency exhibited in data, and focused on shifting patterns or scaling patterns. Along the same general direction, we propose a more flexible yet powerful clustering model, namely u-Cluster (Up-pattern Cluster). Under this model, two objects are similar in a subset of dimensions if there exists a permutation of these dimensions along which both objects exhibit a consistent 'up' pattern. For instance, in DNA microarray analysis, the expression levels of two genes can rise synchronously in response to a sequence of environmental stimuli. Although the magnitudes of their expression levels might not be close and the amounts by which they rise might not be equal, the 'up' patterns that they exhibit can be consistent. Discovery of such clusters of genes is essential in revealing significant connections in gene regulatory networks. In addition, e-commerce applications such as collaborative filtering and stock analysis can also benefit from this model for identifying customer groups that have consistent trends in interests or activities (purchasing, browsing, etc.). We also devise an efficient algorithm that takes advantage of fast sequential pattern mining to detect such clusters. Its efficiency and effectiveness have been demonstrated through experiments on several real data sets.
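
To make the similarity notion above concrete, here is a minimal Python sketch of the membership test only, not the sequential-pattern-mining algorithm the abstract refers to. It checks whether a group of objects shares one permutation of the chosen dimensions along which every object rises. The function name `shared_up_pattern`, the use of strict inequality, and the sample gene values are all assumptions made for illustration; the abstract does not say how ties or near-equal values are handled.

```python
from typing import List, Optional, Sequence


def shared_up_pattern(
    rows: Sequence[Sequence[float]], dims: Sequence[int]
) -> Optional[List[int]]:
    """Return a permutation of `dims` along which every row strictly
    increases, or None if no such shared permutation exists.

    Assumption (not from the abstract): 'up' means strictly increasing,
    so any tie within a row rules the pattern out.
    """
    if not rows or not dims:
        return None

    # The only permutation that can make the first row strictly increasing
    # is the one that sorts its values, so it is the sole candidate.
    first = rows[0]
    order = sorted(dims, key=lambda d: first[d])

    # Every row, including the first, must strictly rise along that order.
    for row in rows:
        values = [row[d] for d in order]
        if any(a >= b for a, b in zip(values, values[1:])):
            return None
    return order


# Hypothetical expression levels of two genes under four stimuli.
gene_a = [10.0, 40.0, 20.0, 90.0]
gene_b = [1.0, 3.5, 2.0, 3.0]

# On dimensions 0, 1, 2 both genes rise along the order 0, 2, 1,
# even though their magnitudes and increments differ.
print(shared_up_pattern([gene_a, gene_b], [0, 1, 2]))

# Adding dimension 3 breaks the pattern: gene_b drops from 3.5 to 3.0.
print(shared_up_pattern([gene_a, gene_b], [0, 1, 2, 3]))
```

Because a strictly increasing row admits exactly one ordering of its dimensions, checking the first row's sorted order against the remaining rows is sufficient for this simplified definition. Finding the subsets of objects and dimensions for which the test succeeds is the harder problem that the proposed algorithm addresses.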
