Abstract: With the rapid development of computing technology, using parallel computing to solve large-scale ranking-and-selection (R&S) problems has emerged as an important research topic. However, directly implementing traditional fully sequential procedures in parallel computing environments may encounter various problems. First, the scheme of all-pairwise comparisons, which is commonly used in fully sequential procedures, requires a large amount of computation and significantly slows down the selection process. Second, traditional fully sequential procedures require frequent communication and coordination among processors, which is also inefficient in parallel computing environments. In this paper, we propose three modifications to one classical fully sequential procedure, Paulson's procedure, to speed up its selection process in parallel computing environments. First, we show that if no common random numbers are used, then we can significantly reduce the computation spent on all-pairwise comparisons at each round. Second, we show that batching different alternatives reduces the communication cost among the processors, which improves the procedure's performance. Third, to boost the procedure's final-stage selection, when the number of surviving alternatives is less than the number of processors, we suggest sampling every surviving alternative up to the maximal number of observations that it should take. We show that, after these modifications, the procedure remains statistically valid and is more efficient than existing parallel procedures in the literature.

Summary of Contribution: Ranking and selection (R&S) is a branch of simulation optimization, which is an important area of operations research. In recent years, using parallel computing to solve large-scale R&S problems has emerged as an important research topic, and this topic is naturally situated at the intersection of computing and operations research. In this paper, we consider how to improve a fully sequential R&S procedure, namely, Paulson's procedure, to reduce the high computational complexity of all-pairwise comparisons and the burden of frequent communication and coordination, so that the procedure is more suitable and more efficient for solving large-scale R&S problems in parallel computing environments, which are becoming ubiquitous and accessible to ordinary users. The procedure designed in this paper appears more efficient than the ones available in the literature and is capable of solving R&S problems with over a million alternatives in a parallel computing environment with 96 processors. The paper also extends the theory of R&S by showing that the all-pairwise comparisons may be decomposed so that the computational complexity is reduced significantly, which drastically improves the efficiency of all-pairwise comparisons as observed in numerical experiments.
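
The idea that all-pairwise comparisons can be decomposed can be illustrated with a minimal sketch. It is not the paper's procedure: it assumes a simplified setting in which every pair of alternatives shares the same elimination threshold at round r (as happens, for example, when the variance is known and common to all alternatives). The function names and the constants `a` and `lam` are hypothetical placeholders chosen for illustration. Under that assumption, checking each alternative against every other one gives the same survivors as checking each alternative against the current best running sum only, which lowers the cost of one round from O(k^2) to O(k).

```python
import random


def paulson_threshold(a, lam, r):
    """Non-negative elimination threshold at round r (a, lam are illustrative)."""
    return max(0.0, a - lam * r)


def survivors_all_pairs(sums, h):
    """O(k^2) rule: i survives unless some other j leads it by more than h."""
    k = len(sums)
    return [
        i for i in range(k)
        if not any(sums[i] < sums[j] - h for j in range(k) if j != i)
    ]


def survivors_vs_best(sums, h):
    """O(k) rule: compare every running sum with the largest one only.

    Equivalent to the all-pairs rule when h >= 0 and h is the same for
    every pair, which is the simplifying assumption of this sketch.
    """
    best = max(sums)
    return [i for i, s in enumerate(sums) if not s < best - h]


if __name__ == "__main__":
    random.seed(1)
    # Hypothetical running sums of observations for 1,000 alternatives.
    running_sums = [random.gauss(0.0, 3.0) for _ in range(1000)]
    h = paulson_threshold(a=5.0, lam=0.5, r=4)
    slow = survivors_all_pairs(running_sums, h)
    fast = survivors_vs_best(running_sums, h)
    print(len(fast), "survivors; rules agree:", slow == fast)
```

When the threshold differs from pair to pair, as with unknown and unequal variances, this shortcut does not apply directly; the paper's own decomposition addresses that more general case.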

Dualheap Selection Algorithm: Efficient, Inherently Parallel and Somewhat Mysterious

Distributed Privacy-Aware Fast Selection Algorithm for Large-Scale Data.

Parallel External Selection Algorithm on Distributed Memory Systems

An Optimal External Selection Algorithm and Its Application in the Internet.

A nearly optimal randomized algorithm for explorable heap selection

Parallel Algorithms for Select and Partition with Noisy Comparisons

Parallel Sorting by Approximate Splitting for Multi-core Processors

Fully Sequential Procedures for Large-Scale Ranking-and-Selection Problems in Parallel Computing Environments

Median of heaps: linear-time selection by recursively constructing binary heaps

An efficient sorting algorithm - Ultimate Heapsort(UHS)

A Partitioning Selection Algorithm on Multiprocessors

Selection Improvements on the Parallel Iterative Algorithm for Stable Matching

Coarse Grained Parallel Selection

Best of Both Worlds: Practical and Theoretically Optimal Submodular Maximization in Parallel

Weak heaps engineered

Replicable Parallel Branch and Bound Search

Speeding Up Paulson's Procedure for Large-Scale Problems Using Parallel Computing

Practical Massively Parallel Sorting

High-Performance and Flexible Parallel Algorithms for Semisort and Related Problems

Optimal Data Selection: An Online Distributed View

Dual-Directed Algorithm Design for Efficient Pure Exploration