Generalized Statistical Tests for mRNA and Protein Subcellular Spatial Patterning against Complete Spatial Randomness

Jonathan H. Warrell, Anca F. Savulescu, Robyn Brackin, Musa M. Mhlanga
DOI: https://doi.org/10.48550/arXiv.1602.06429
IF: 5.414
2016-02-20
Machine Learning
Abstract: We derive generalized estimators for a number of spatial statistics that have been used in the analysis of spatially resolved omics data, such as Ripley's K, H and L functions, the clustering index, and the degree of clustering, which allow these statistics to be calculated on data modelled by arbitrary random measures (RMs). Our estimators generalize those typically used to calculate these statistics on point process data, allowing them to be calculated on RMs which assign continuous values to spatial regions, for instance to model protein intensity. The clustering index (H*) compares Ripley's H function calculated empirically to its distribution under complete spatial randomness (CSR), leading us to consider CSR null hypotheses for RMs which are not point processes when generalizing this statistic. We thus consider restricted classes of completely random measures which can be simulated directly (Gamma processes and Marked Poisson Processes), as well as the general class of all CSR RMs, for which we derive an exact permutation-based H* estimator. We establish several properties of the estimators, including bounds on the accuracy of our general Ripley K estimator, its relationship to a previous estimator for the cross-correlation measure, and the relationship of our generalized H* estimator to previous statistics. To test the ability of our approach to identify spatial patterning, we use Fluorescent In Situ Hybridization (FISH) and Immunofluorescence (IF) data to probe for mRNA and protein subcellular localization patterns, respectively, in polarizing mouse fibroblasts grown on micropatterned substrates. We observe correlated patterns of clustering over time for corresponding mRNAs and proteins, suggesting a deterministic effect of mRNA localization on protein localization for several pairs tested, including one case in which spatial patterning at the mRNA level has not been previously demonstrated.
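To make the statistics named in the abstract concrete, the following is a minimal illustrative sketch and not the estimators derived in the paper. It assumes a naive weighted form of Ripley's K with no edge correction, in which each location carries a non-negative weight (unit weights for point data, intensities for pixel data), the standard 2D transforms L(r) = sqrt(K(r)/pi) and H(r) = L(r) - r, and a simple null distribution obtained by shuffling weights across locations. The function names `weighted_ripley_k`, `ripley_h` and `permutation_h_null` are hypothetical.

```python
import numpy as np


def weighted_ripley_k(coords, weights, radii, area):
    """Naive weighted Ripley's K without edge correction (illustrative only).

    K(r) = area * sum_{i != j} w_i w_j 1[d_ij <= r] / sum_{i != j} w_i w_j
    With unit weights this reduces to the usual point-pattern estimator
    normalised by n * (n - 1).
    """
    coords = np.asarray(coords, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Pairwise Euclidean distances between all locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Products of weights for every ordered pair, excluding self-pairs.
    ww = np.outer(weights, weights)
    np.fill_diagonal(ww, 0.0)
    total = ww.sum()
    return np.array([area * ww[dist <= r].sum() / total for r in radii])


def ripley_h(k_values, radii):
    """H(r) = L(r) - r with L(r) = sqrt(K(r) / pi), for 2D data."""
    return np.sqrt(np.asarray(k_values) / np.pi) - np.asarray(radii, dtype=float)


def permutation_h_null(coords, weights, radii, area, n_perm=200, seed=0):
    """Null samples of H(r) obtained by shuffling weights over locations.

    This is an assumed, simplified stand-in for a CSR null; it is not the
    exact permutation-based H* estimator described in the abstract.
    """
    rng = np.random.default_rng(seed)
    null = np.empty((n_perm, len(radii)))
    for p in range(n_perm):
        shuffled = rng.permutation(np.asarray(weights, dtype=float))
        null[p] = ripley_h(weighted_ripley_k(coords, shuffled, radii, area), radii)
    return null
```

An observed H(r) could then be compared with the quantiles of the rows returned by `permutation_h_null` at each radius; values well above the upper quantile would suggest clustering at that scale under these simplifying assumptions.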