A Short Note on Spearman Correlation: Impact of Tied Observations

Yang Liu
DOI: https://doi.org/10.2139/ssrn.2933193
2017-01-01
SSRN Electronic Journal
Abstract: The Spearman correlation is a well-known approach to assessing the rank correlation of two data sets. One advantage of choosing Spearman over other correlation coefficients, such as Pearson's, is that the coefficient depends on the relative ranks of the values rather than on the differences between the original values. The Spearman correlation coefficient is often used to assess and validate the performance of models that require less accuracy in the absolute value of the estimate, e.g. loss prediction models or exposure models. Although other measures, such as Kendall’s τ and Somers’ D, are used to measure rank ordering with tied observations, Spearman’s ρ is often calculated as an initial step of correlation analysis. In this short note we look into tied observations in the target data set and investigate their impact on the Spearman correlation coefficient in three different scenarios: single-value ties, random multi-value ties, and bounded random ties.
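As a rough illustration of why tied observations matter, the sketch below computes Spearman's ρ as the Pearson correlation of average ranks and compares it with the familiar shortcut formula 1 − 6Σd²/(n(n² − 1)), which is exact only when there are no ties. The data and the function names (average_ranks, spearman, spearman_shortcut) are hypothetical and are not taken from the paper; the tied series only loosely mimics a single-value tie and does not reproduce the paper's scenarios.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman(x, y):
    """Spearman's rho as Pearson correlation of average ranks (valid with ties)."""
    return pearson(average_ranks(x), average_ranks(y))

def spearman_shortcut(x, y):
    """Shortcut formula 1 - 6*sum(d^2)/(n(n^2-1)); exact only without ties."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical data: model predictions and an observed target series.
predicted = [1, 2, 3, 4, 5, 6]
observed = [10, 20, 30, 40, 50, 60]      # no ties
observed_tied = [0, 0, 0, 40, 50, 60]    # three observations tied at one value

print(spearman(predicted, observed))
print(spearman(predicted, observed_tied))
print(spearman_shortcut(predicted, observed_tied))
```

With the tied target series, the two ways of computing ρ no longer agree, and ρ falls below 1 even though the ordering of the untied observations is unchanged.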