Abstract: This paper describes the theoretical background, algorithm and validation of a recently developed method of ranking based on the sum of ranking differences [TrAC Trends Anal. Chem. 2010; 29: 101–109]. The ranking is intended to compare models, methods, analytical techniques, panel members, etc., and it is entirely general. First, the objects to be ranked are arranged in the rows and the variables (for example, model results) in the columns of an input matrix. Then, the results of each model are ranked across the objects in order of increasing magnitude. The difference between the rank of the model results and the rank of the known, reference or standard results is then computed for each object. (If the golden standard ranking is known, the rank differences can be computed easily.) Finally, the absolute values of the differences are summed over the objects for each model to be compared. The sum of ranking differences (SRD) arranges the models in a unique and unambiguous way. The closer the SRD value is to zero (i.e. the closer the ranking is to the golden standard), the better the model. Proximity of SRD values shows similarity of the models, whereas large variation implies dissimilarity. Generally, the average can be accepted as the golden standard in the absence of known or reference results, even if bias is present in the model results in addition to random error. The SRD method can be validated by comparison with simulated random numbers (permutation test). A recursive algorithm calculates the discrete distribution for a small number of objects (n < 14), whereas the normal distribution is used as a reasonable approximation if the number of objects is large. The theoretical distribution is visualized for random numbers and can be used to identify SRD values for models that are far from being random.
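The ranking steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `average_ranks` and `srd`, the toy data, and the choice to give tied values the mean of their ranks are all assumptions made for illustration. When no reference column is supplied, the sketch falls back on the row average as the golden standard, as the abstract suggests.

```python
def average_ranks(values):
    """Rank values in increasing order (1-based); ties share the mean rank.

    Assumption: average ranks for ties are an illustrative choice here.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def srd(matrix, reference=None):
    """Return one SRD value per column (rows = objects, columns = models)."""
    n_models = len(matrix[0])
    if reference is None:
        # No known standard: use the row average as the golden standard.
        reference = [sum(row) / n_models for row in matrix]
    ref_ranks = average_ranks(reference)
    result = []
    for j in range(n_models):
        col_ranks = average_ranks([row[j] for row in matrix])
        # Sum of absolute rank differences over all objects for model j.
        result.append(sum(abs(r - g) for r, g in zip(col_ranks, ref_ranks)))
    return result


# Hypothetical toy data: 4 objects (rows) scored by 3 models (columns).
data = [
    [1.0, 1.2, 3.9],
    [2.0, 3.3, 3.1],
    [3.0, 1.9, 2.2],
    [4.0, 3.8, 0.8],
]
known_reference = [1.0, 2.0, 3.0, 4.0]

print(srd(data, known_reference))  # -> [0.0, 2.0, 8.0]
print(srd(data))                   # row average as reference -> [2.0, 0.0, 6.0]
```

In this toy case the first model reproduces the known reference ranking exactly (SRD = 0), while the third model reverses it and obtains the largest possible SRD for four objects.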
The ranking and validation procedures are called Sum of Ranking Differences (SRD) and Comparison of Ranks by Random Numbers (CRRN), respectively. Copyright © 2010 John Wiley & Sons, Ltd. The theoretical background and algorithm are described for the sum of ranking differences (SRD), a novel procedure for ordering and grouping methods and models. The proximity of SRD values shows similarity of the methods, whereas large variation implies dissimilarity. The SRD procedure can be validated by comparison with simulated random numbers (a permutation test called CRRN). The theoretical distribution is visualized, and probabilities are assigned to SRD values to show whether they are far from being random.
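The idea behind the validation step can be sketched as follows. This is a hedged illustration, not the recursive algorithm mentioned in the abstract: it enumerates all n! rankings by brute force, which is practical only for very small n, and it ignores ties. The function names are hypothetical. For the normal approximation, the sketch assumes the known mean (n² − 1)/3 and variance (n + 1)(2n² + 7)/45 of the sum of absolute rank differences under uniformly random permutations; the source does not state which parameters it uses.

```python
from collections import Counter
from itertools import permutations
from statistics import NormalDist


def exact_srd_distribution(n):
    """Count how often each SRD value occurs over all n! rankings (no ties).

    Brute-force enumeration: a stand-in for the paper's recursive algorithm.
    """
    reference = range(1, n + 1)
    counts = Counter()
    for perm in permutations(reference):
        counts[sum(abs(p - r) for p, r in zip(perm, reference))] += 1
    return counts


def exact_probability(n, observed_srd):
    """P(SRD <= observed_srd) when the ranking of n objects is random."""
    counts = exact_srd_distribution(n)
    total = sum(counts.values())
    return sum(c for value, c in counts.items() if value <= observed_srd) / total


def normal_probability(n, observed_srd):
    """Normal approximation to P(SRD <= observed_srd), intended for large n.

    Assumption: moments of the random-ranking SRD distribution as stated
    in the lead-in, not taken from the source.
    """
    mean = (n * n - 1) / 3
    variance = (n + 1) * (2 * n * n + 7) / 45
    return NormalDist(mean, variance ** 0.5).cdf(observed_srd)


# For 3 objects the six possible rankings give SRD values 0, 2, 2, 4, 4, 4.
print(sorted(exact_srd_distribution(3).items()))  # -> [(0, 1), (2, 2), (4, 3)]

# Probability that a random ranking of 8 objects scores SRD <= 4,
# computed exactly and with the normal approximation.
print(exact_probability(8, 4))
print(normal_probability(8, 4))
```

A model whose SRD lies far in the lower tail of this distribution ranks the objects more consistently with the reference than a random ordering would be expected to.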

Sum of ranking differences for method discrimination and its validation: comparison of ranks with random numbers
