Identifying Quality Mersenne Twister Streams For Parallel Stochastic Simulations

Benjamin Antunes, Claude Mazel, David R.C Hill
2024-01-30
Abstract: The Mersenne Twister (MT) is a pseudo-random number generator (PRNG) widely used in High Performance Computing for parallel stochastic simulations. We aim to assess the quality of common parallelization techniques used to generate large streams of MT pseudo-random numbers. We compare three techniques: sequence splitting, random spacing and MT indexed sequence. The TestU01 Big Crush battery is used to evaluate the quality of 4096 streams for each technique on three different hardware configurations. Surprisingly, all techniques exhibited almost 30% of defects with no technique showing better quality than the others. While all 106 Big Crush tests showed failures, the failure rate was limited to a small number of tests (maximum of 6 tests failed per stream, resulting in over 94% success rate). Thanks to 33 CPU years, high-quality streams identified are given. They can be used for sensitive parallel simulations such as nuclear medicine and precise high-energy physics applications.
Distributed, Parallel, and Cluster Computing
What problem does this paper attempt to address?
The main objective of this paper is to evaluate how the parallelization techniques used in High Performance Computing (HPC) for parallel stochastic simulations affect the quality of the random number streams generated by the Mersenne Twister (MT) pseudorandom number generator. Specifically, the study compares three parallelization methods (Sequence Splitting, Random Spacing, and MT Indexed Sequence) to determine whether the choice of method affects stream quality. The study is conducted through the following steps:
1. **Experimental Design**: The authors selected three different hardware configurations (A, B, and C) and ran the same experiments on each. Each configuration used a different version of the gcc compiler, leading to slight differences between the executable files. The original 32-bit Mersenne Twister (MT) was chosen as the subject of the study because it is widely used and has good performance characteristics.
2. **Generating Random Number States**: To evaluate the different parallelization techniques, the authors generated 4096 MT states with each of the three methods. Sequence Splitting took longer to generate its states, while the Random Spacing and MT Indexed Sequence methods were faster.
3. **Testing the Quality of Random Number Streams**: The random number stream produced from each state was tested with the Big Crush battery from TestU01. The tests covered both integer and real number sequences. The results showed that all techniques had a defect rate of approximately 30%, with no technique showing a clear advantage.
4. **Evaluating Repeatability and Reproducibility**: The experiments were repeated on the three hardware configurations to assess the repeatability and reproducibility of the results. The recent evolution of the term "reproducibility" was also discussed.
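As a rough illustration of how the three ways of producing initial states differ, here is a minimal Python sketch. It relies on the standard-library `random.Random` class, which implements the 32-bit MT19937. The function names, the default seed, and the block size are hypothetical, and the reading of "MT Indexed Sequence" as "seed stream i with the integer i" is an assumption that the summary above does not spell out. This is not the authors' code.

```python
import random


def sequence_splitting(n_streams, block_size, seed=5489):
    """Cut one long MT sequence into consecutive blocks.

    The state saved for stream k is the master state after
    k * block_size 32-bit outputs have been consumed.
    """
    master = random.Random(seed)
    states = []
    for _ in range(n_streams):
        states.append(master.getstate())
        for _ in range(block_size):
            master.getrandbits(32)  # consumes exactly one 32-bit MT output
    return states


def random_spacing(n_streams, seed=5489):
    """Seed each stream with a value drawn from a master generator."""
    master = random.Random(seed)
    return [random.Random(master.getrandbits(32)).getstate()
            for _ in range(n_streams)]


def indexed_sequence(n_streams):
    """Assumed interpretation: stream i is seeded with the integer i."""
    return [random.Random(i).getstate() for i in range(n_streams)]


def stream_from(state):
    """Rebuild a generator from a saved state."""
    rng = random.Random()
    rng.setstate(state)
    return rng


if __name__ == "__main__":
    states = sequence_splitting(n_streams=4, block_size=1000)
    workers = [stream_from(s) for s in states]
    print([w.getrandbits(32) for w in workers])
```

The sketch also suggests why Sequence Splitting is the slowest to set up: reaching the start of stream k means stepping through every earlier block, so the cost grows with `n_streams * block_size`. With this approach `block_size` has to exceed the number of draws any single stream will consume, otherwise neighbouring streams overlap. Random Spacing avoids the stepping cost but gives no hard guarantee against overlap.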
In summary, this paper evaluates the impact of parallelization techniques on the quality of random number streams generated by the MT pseudorandom number generator, and assesses the repeatability and reproducibility of the results through large-scale experiments on three hardware configurations.