k-Means NANI: An Improved Clustering Algorithm for Molecular Dynamics Simulations

Lexin Chen, Daniel R. Roe, Matthew Kochert, Carlos Simmerling, Ramón Alain Miranda-Quintana
DOI: https://doi.org/10.1021/acs.jctc.4c00308
2024-06-23
Journal of Chemical Theory and Computation
Abstract: One of the key challenges of k-means clustering is the seed selection or the initial centroid estimation since the clustering result depends heavily on this choice. Alternatives such as k-means++ have mitigated this limitation by estimating the centroids using an empirical probability distribution. However, with high-dimensional and complex data sets such as those obtained from molecular simulation, k-means++ fails to partition the data in an optimal manner. Furthermore, stochastic elements in...
chemistry, physical; physics, atomic, molecular & chemical