Abstract: This paper presents a novel perspective on correlation functions in the clustering analysis of the large-scale structure of the universe. We first recognise that pair counting in bins of radial separation is equivalent to evaluating counts-in-cells (CIC), which can be modelled using a filtered density field with a binning-window function. This insight leads to an in situ expression for the two-point correlation function (2PCF). The core idea of our method is to let a window function define the binning scheme, enabling pair counting without binning. This approach yields a generalised 2PCF, which extends beyond conventional discrete pair counting by accommodating non-sharp-edged window functions. To extend this framework to N-point correlation functions (NPCF) using current optimal edge-corrected estimators, we develop a binning scheme that is independent of the specific parameterisation of polyhedral configurations. In particular, we demonstrate a fast algorithm for the three-point correlation function (3PCF), in which triplet counting is accomplished by assigning either a spherical tophat or a Gaussian filter to each vertex of the triangles. Additionally, we derive analytical expressions for the 3PCF using a multipole expansion in Legendre polynomials, accounting for filtered-field (binning) corrections. Numerical tests using several suites of N-body simulation samples show that our approach aligns remarkably well with the theoretical predictions. Our method provides an exact solution for quantifying binning effects in practical measurements and offers a high-speed algorithm, enabling high-order clustering analysis of extremely large datasets from ongoing and upcoming surveys such as Euclid, LSST, and DESI.
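The equivalence between binned pair counting and window-weighted counting can be illustrated with a minimal brute-force sketch. This is not the filtered-density-field algorithm of the paper; it only assumes that a pair count in a radial bin can be written as a sum of a window function W(r) over all pair separations. The function names (`pair_separations`, `tophat_shell`, `gaussian_shell`, `weighted_pair_count`) and the Gaussian shell profile are hypothetical choices made for illustration.

```python
import numpy as np

def pair_separations(points):
    """Return the separations of all unique pairs (i < j) by brute force."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)
    return dist[upper]

def tophat_shell(r, r_lo, r_hi):
    """Sharp-edged binning window: 1 inside [r_lo, r_hi), 0 outside."""
    return ((r >= r_lo) & (r < r_hi)).astype(float)

def gaussian_shell(r, r_c, sigma):
    """Smooth window centred on r_c (an assumed, illustrative profile)."""
    return np.exp(-0.5 * ((r - r_c) / sigma) ** 2)

def weighted_pair_count(points, window):
    """Generalised pair count: sum of window(r_ij) over unique pairs."""
    return window(pair_separations(points)).sum()

# Random points in a unit box (assumed test data, not a survey sample).
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 1.0, size=(200, 3))
r = pair_separations(pts)

# Conventional binned count versus the tophat-window count for one bin.
binned, _ = np.histogram(r, bins=[0.1, 0.2])
windowed = weighted_pair_count(pts, lambda s: tophat_shell(s, 0.1, 0.2))

# A non-sharp-edged window gives a real-valued "generalised" count.
smooth = weighted_pair_count(pts, lambda s: gaussian_shell(s, 0.15, 0.02))
```

With the tophat window the weighted sum reproduces the histogram count for the same bin, while the Gaussian window has no discrete-bin counterpart. The brute-force O(N^2) distance matrix is used here only for clarity and would not scale to survey-sized catalogues.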

ClusterFinder: a fast tool to find cluster structures from pair distribution function data

Constraint-based Clustering by Fast Search and Find of Density Peaks

Pair Counting without Binning -- A New Approach to Correlation Functions in Clustering Statistics

Algorithm for distance list extraction from pair distribution functions

Fast Clustering Using Adaptive Density Peak Detection

There's no place like real-space: elucidating size-dependent atomic structure of nanomaterials using pair distribution function analysis

Distribution-Based Cluster Structure Selection.

Extracting structural motifs from pair distribution function data of nanostructures using explainable machine learning

Pair distribution function and structure factor of spherical particles

Multivariate Functional Data Clustering Using Adaptive Density Peak Detection.

A functional clustering algorithm for the analysis of dynamic network data

Cluster Analysis of Accelerated Molecular Dynamics Simulations: A Case Study of the Decahedron to Icosahedron Transition in Pt Nanoparticles

FINEX: A Fast Index for Exact & Flexible Density-Based Clustering (Extended Version with Proofs)*

Automatically Selecting Cluster Centers in Clustering by Fast Search and Find of Density Peaks with Data Field

Robust Two-Layer Partition Clustering of Sparse Multivariate Functional Data

Nanostructure determination from the pair distribution function: A parametric study of the INVERT approach

Anomaly Detection Algorithm Based on CFSFDP

Fast estimation of ion-pairing for screening electrolytes: A cluster can approximate a bulk liquid

Sparse clusterability: testing for cluster structure in high dimensions

A New Functional Clustering Method with Combined Dissimilarity Sources and Graphical Interpretation