Randomized Kaczmarz for Rank Aggregation from Pairwise Comparisons

Vivek S. Borkar, Nikhil Karamchandani, Sharad Mirani
DOI: https://doi.org/10.48550/arXiv.1605.02470
2016-05-09
Abstract: We revisit the problem of inferring the overall ranking among entities in the framework of Bradley-Terry-Luce (BTL) model, based on available empirical data on pairwise preferences. By a simple transformation, we can cast the problem as that of solving a noisy linear system, for which a ready algorithm is available in the form of the randomized Kaczmarz method. This scheme is provably convergent, has excellent empirical performance, and is amenable to on-line, distributed and asynchronous variants. Convergence, convergence rate, and error analysis of the proposed algorithm are presented and several numerical experiments are conducted whose results validate our theoretical findings.
Machine Learning
What problem does this paper attempt to address?
This paper addresses the problem of inferring an overall ranking among entities from pairwise comparison data. Specifically, within the framework of the Bradley-Terry-Luce (BTL) model, the authors use empirical pairwise preference data to infer the ranking.

### Problem Background

In many applications, such as web search, recommendation systems, competitive sports, and online gaming, a single "consensus" ranking must be derived from many partial preferences so that it best describes the available data. Such problems are called **Rank Aggregation**. When the data consists of pairwise comparisons (for example, which of two items a user prefers, or the result of a match between two chess players), the inherent "quality" or "score" of each item can be estimated from the comparison outcomes.

### Research Methods

The authors model pairwise comparisons with the BTL model and, through a simple transformation, recast the problem as solving a noisy system of linear equations. The steps are as follows:

1. **Model Assumption**: Each entity \(i\) has a weight \(w_i>0\), representing its inherent quality or score. For any pair of entities \(i\) and \(j\), the probability that \(i\) is preferred over \(j\) is: \[ p_{ij}=\frac{w_i}{w_i + w_j} \]
2. **Data Generation**: Each pair of entities is compared multiple times, yielding outcomes \(X_{ij}^k\), where \(X_{ij}^k = 1\) if \(i\) wins the \(k\)-th comparison and \(X_{ij}^k = 0\) otherwise.
3. **Problem Transformation**: The empirical preference frequency \(\hat{p}_{ij}\) of each compared pair is mapped to its log-odds: \[ y'_{ij}=-\log\left(\frac{1}{\hat{p}_{ij}} - 1\right) \] Under the BTL model the true log-odds satisfy \(\log\frac{p_{ij}}{1-p_{ij}}=\log w_i-\log w_j\), so \(y'_{ij}\) is a noisy observation of \(v_i - v_j\) with \(v_i=\log w_i\). The problem therefore becomes that of solving the noisy linear system \(y' = L^T v\), where \(L\) is the incidence matrix of the comparison graph and \(v\) is the vector of log-weights.

### Solution

To solve this noisy linear system, the authors adopt the randomized Kaczmarz algorithm. It has the following advantages:

- **Convergence**: The algorithm is provably convergent and performs well empirically.
- **Adaptability**: It is amenable to online, distributed, and asynchronous variants.

### Main Contributions

1. **Theoretical Analysis**: The authors provide convergence, convergence-rate, and error analyses of the algorithm. In particular, for Erdős-Rényi comparison graphs, the number of pairwise comparisons required is order-optimal.
2. **Experimental Verification**: Numerical experiments are conducted whose results are consistent with the theoretical findings.

In summary, the paper solves the rank aggregation problem from pairwise comparison data with the randomized Kaczmarz algorithm and supports the method's effectiveness with both theoretical analysis and numerical experiments.
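The pipeline described under Research Methods and Solution can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `simulate_log_odds` and `randomized_kaczmarz`, the clipping of \(\hat{p}_{ij}\) away from 0 and 1, the zero initialization, and the example weights are all assumptions made for illustration. Each row of \(L^T\) for an edge \((i, j)\) is \(e_i - e_j\), so every row has squared norm 2; norm-proportional row sampling then reduces to uniform sampling over edges.

```python
import math
import random


def simulate_log_odds(weights, edges, num_comparisons, rng):
    """Simulate BTL comparisons on each edge and return empirical log-odds y'_ij."""
    y = []
    for i, j in edges:
        p = weights[i] / (weights[i] + weights[j])  # BTL probability that i beats j
        wins = sum(rng.random() < p for _ in range(num_comparisons))
        # Assumption of this sketch: clip the frequency away from 0 and 1
        # so that the log-odds stay finite.
        eps = 0.5 / num_comparisons
        p_hat = min(max(wins / num_comparisons, eps), 1.0 - eps)
        y.append(-math.log(1.0 / p_hat - 1.0))
    return y


def randomized_kaczmarz(num_items, edges, y, num_iters, rng):
    """Approximately solve y = L^T v by randomized Kaczmarz, starting from v = 0."""
    v = [0.0] * num_items
    for _ in range(num_iters):
        k = rng.randrange(len(edges))  # all rows have equal norm: sample uniformly
        i, j = edges[k]
        residual = y[k] - (v[i] - v[j])
        # Project onto the hyperplane v_i - v_j = y_k; the row e_i - e_j
        # has squared norm 2, hence the division by 2.
        v[i] += residual / 2.0
        v[j] -= residual / 2.0
    return v


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [1.0, 2.0, 4.0, 8.0]  # hypothetical ground-truth BTL weights
    edges = [(i, j) for i in range(4) for j in range(i + 1, 4)]  # complete graph
    y = simulate_log_odds(weights, edges, num_comparisons=2000, rng=rng)
    v = randomized_kaczmarz(len(weights), edges, y, num_iters=5000, rng=rng)
    ranking = sorted(range(len(weights)), key=lambda i: -v[i])
    print("estimated log-weights:", v)
    print("estimated ranking (best first):", ranking)
```

Because each update adds and subtracts the same amount, the sum of the entries of `v` stays at its initial value of zero, so the sketch estimates the log-weights only up to the additive constant that the BTL model leaves undetermined. With noisy `y` the iterates are not expected to settle exactly; they fluctuate around a solution at a scale set by the noise.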