Significance Test of Clustering under High Dimensional Setting with Applications to Cancer Data

Ping Dong, Lu Lin, Yunquan Song
DOI: https://doi.org/10.1080/00949655.2018.1518448
IF: 1.225
2018-01-01
Journal of Statistical Computation and Simulation
Abstract: For high-dimensional data, SigClust was developed to test the significance of clustering. The cluster index (CI) used by SigClust is defined as the ratio of the within-cluster sum of squares to the total sum of squares. However, its empirical size is over-controlled, which makes the test too conservative. By removing the cumbersome terms in the CI, this paper proposes an improved index (BCI). The coefficient of variation of the BCI is significantly reduced, which implies that the new index is more stable. Moreover, the new significance test (NewSig) maintains the size while providing greater power. Simulation experiments and two real cancer data examples illustrate the performance of the new methodology.
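The CI described in the abstract can be illustrated with a minimal sketch. It assumes the cluster labels are already given. The function name `cluster_index` is hypothetical and is not taken from the paper or from any SigClust software. The abstract does not state the BCI formula, so only the CI is shown.

```python
def cluster_index(points, labels):
    """Within-cluster sum of squares divided by total sum of squares.

    points: list of equal-length numeric rows (one row per observation).
    labels: list of cluster labels, one per row.
    Values near 0 indicate tight, well-separated clusters; a single
    cluster gives exactly 1.
    """
    dim = len(points[0])

    def sum_sq_about_mean(rows):
        # Sum of squared deviations of the rows from their own mean vector.
        mean = [sum(r[j] for r in rows) / len(rows) for j in range(dim)]
        return sum((r[j] - mean[j]) ** 2 for r in rows for j in range(dim))

    total_ss = sum_sq_about_mean(points)
    within_ss = 0.0
    for k in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == k]
        within_ss += sum_sq_about_mean(members)
    return within_ss / total_ss


# Illustrative toy data (not from the paper): two tight groups far apart.
toy_points = [[0, 0], [0, 2], [10, 0], [10, 2]]
toy_labels = [0, 0, 1, 1]
toy_ci = cluster_index(toy_points, toy_labels)
```

For the toy data the within-cluster sum of squares is 4 and the total sum of squares is 104, so the CI is 4/104. A significance test built on such an index would compare the observed value with its distribution under a null model, a step this sketch leaves out.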