Abstract: The Bradley-Terry-Luce (BTL) model is one of the most widely used models for ranking a collection of items or agents based on pairwise comparisons among them. Given $n$ agents, the BTL model endows each agent $i$ with a latent skill score $\alpha_i > 0$ and posits that the probability that agent $i$ is preferred over agent $j$ is $\alpha_i/(\alpha_i + \alpha_j)$. In this work, our objective is to formulate a hypothesis test that determines whether a given pairwise comparison dataset, with $k$ comparisons per pair of agents, originates from an underlying BTL model. We formalize this testing problem in the minimax sense and define the critical threshold of the problem. We then establish upper bounds on the critical threshold for general induced observation graphs (satisfying mild assumptions) and develop lower bounds for complete induced graphs. Together, these bounds show that for complete induced graphs, the critical threshold scales as $\Theta((nk)^{-1/2})$ in a minimax sense. In particular, our test statistic for the upper bounds is based on a new approximation we derive for the separation distance between general pairwise comparison models and the class of BTL models. To further assess the performance of our statistical test, we prove upper bounds on the type I and type II error probabilities. Much of our analysis is conducted for a fixed observation graph structure, where the graph possesses certain ``nice'' properties, such as expansion and bounded principal ratio. Additionally, we derive several auxiliary results, including bounds on principal ratios of graphs, $\ell^2$-bounds on BTL parameter estimation under model mismatch, and stability of rankings under the BTL model. We validate our theoretical results through experiments on synthetic and real-world datasets and propose a data-driven permutation testing approach to determine test thresholds.
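As a concrete illustration of the model stated in the abstract, the following minimal sketch computes the BTL preference probabilities $\alpha_i/(\alpha_i + \alpha_j)$ and simulates $k$ comparisons for every pair of agents on a complete observation graph. The function names `btl_win_prob` and `simulate_comparisons`, the example skill scores, and the use of NumPy are assumptions made for illustration only; this is not the paper's code, and it does not implement the paper's test statistic.

```python
import numpy as np

def btl_win_prob(alpha):
    """Return P with P[i, j] = alpha_i / (alpha_i + alpha_j).

    Diagonal entries equal 1/2 and carry no meaning (no self-comparisons).
    """
    alpha = np.asarray(alpha, dtype=float)
    return alpha[:, None] / (alpha[:, None] + alpha[None, :])

def simulate_comparisons(alpha, k, seed=0):
    """Simulate k independent comparisons for every pair i < j.

    wins[i, j] counts how often agent i was preferred over agent j.
    Assumes a complete observation graph (every pair is compared).
    """
    rng = np.random.default_rng(seed)
    p = btl_win_prob(alpha)
    n = p.shape[0]
    wins = np.zeros((n, n), dtype=int)
    iu, ju = np.triu_indices(n, k=1)       # all pairs with i < j
    w = rng.binomial(k, p[iu, ju])         # wins of i over j out of k
    wins[iu, ju] = w
    wins[ju, iu] = k - w                   # remaining comparisons go to j
    return wins

# Hypothetical skill scores for three agents, 50 comparisons per pair.
alpha = [1.0, 2.0, 4.0]
p = btl_win_prob(alpha)
wins = simulate_comparisons(alpha, k=50)
```

Data generated this way follows a BTL model by construction, so it could serve as a null-hypothesis example when experimenting with a goodness-of-fit test of the kind the abstract describes.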

Experimental Design under the Bradley-Terry Model

Bayesian Optimization Based on Pseudo Labels

On Extending the Bradley-Terry Model to Accommodate Ties in Paired Comparison Experiments

The many routes to the ubiquitous Bradley-Terry model

Optimal design of experiments to identify latent behavioral types

Ties in Paired-Comparison Experiments: A Generalization of the Bradley-Terry Model

Experimental Design For Causal Inference Through An Optimization Lens

Synthetic Principal Component Design: Fast Covariate Balancing with Synthetic Controls

Near-Optimal Experimental Design under the Budget Constraint in Online Platforms

Accelerating Experimental Design by Incorporating Experimenter Hunches

Better Experimental Design by Hybridizing Binary Matching with Imbalance Optimization

Synthetic Design: An Optimization Approach to Experimental Design with Synthetic Controls

Minimax Hypothesis Testing for the Bradley-Terry-Luce Model

Graph-Based Bayesian Optimization for Large-Scale Objective-Based Experimental Design

Enhanced Bayesian Optimization via Preferential Modeling of Abstract Properties

Data-Driven Switchback Experiments: Theoretical Tradeoffs and Empirical Bayes Designs

Multivariate Tie-breaker Designs

Pigeonhole Design: Balancing Sequential Experiments from an Online Matching Perspective

Optimal Adaptive Experimental Design for Estimating Treatment Effect

Bayesian optimization with adaptive surrogate models for automated experimental design

Optimal experimental design: Formulations and computations