LXM: Better Splittable Pseudorandom Number Generators (and Almost as Fast)
LXM: Better Splittable Pseudorandom Number Generators (and Almost as Fast)
22 October 2021
Paper to be submitted to ACM OOPSLA 2021. Abstract: In 2014, Steele, Lea, and Flood presented {\sc SplitMix}, an object-oriented pseudorandom number generator (PRNG) that is quite fast (9 64-bit arithmetic/logical operations per 64 bits generated) and also {\it splittable}. A conventional PRNG object provides a {\it generate} method that returns one pseudorandom value and updates the state of the PRNG; a splittable PRNG object also has a second operation, {\it split}, that replaces the original PRNG object with two (seemingly) independent PRNG objects, by creating and returning a new such object and updating the state of the original object. Splittable PRNG objects make it easy to organize the use of pseudorandom numbers in multithreaded programs structured using fork-join parallelism. This overall strategy still appears to be sound, but the specific arithmetic calculation used for {\it generate} in the {\sc SplitMix} algorithm has some detectable weaknesses, and the period of any one generator is limited to $2^{64}$. Here we present the LXM \emph{family} of PRNG algorithms. The idea is an old one: combine the outputs of two independent PRNG algorithms, then (optionally) feed the result to a mixing function. An LXM algorithm uses a linear congruential subgenerator and an $\mathbf{F}_2$-linear subgenerator; the examples studied in this paper use an LCG of period $2^{16}$, $2^{32}$, $2^{64}$, or $2^{128}$ with one of the multipliers recommended by L'Ecuyer or by Steele and Vigna, and an $\mathbf{F}_2$-linear generator of the \texttt{xoshiro} family or \texttt{xoroshiro} family as described by Blackman and Vigna. Mixing functions studied in this paper include the MurmurHash3 finalizer function, David Stafford's variants, Doug Lea's variants, and the null (identity) mixing function. Like {\sc SplitMix}, LXM provides both a \emph{generate} operation and a \emph{split} operation. Also like {\sc SplitMix}, LXM requires no locking or other synchronization (other than the usual memory fence after instance initialization), and is suitable for use with {\sc simd} instruction sets because it has no branches or loops. We analyze the period and equidistribution properties of LXM generators, and present the results of thorough testing of specific members of this family, using the TestU01 and PractRand test suites, not only on single instances of the algorithm but also for collections of instances, used in parallel, ranging in size from $2$ to $2^{27}$. Single instances of LXM that include a strong mixing function appear to have no major weaknesses, and LXM is significantly more robust than {\sc SplitMix} against accidental correlation in a multithreaded setting. We believe that LXM is suitable for the same sorts of applications as {\sc SplitMix}, that is, ``everyday'' scientific and machine-learning applications (but not cryptographic applications), especially when concurrent threads or distributed processes are involved.
Venue : ACM OOPSLA 2021
File Name : lxm-paper.pdf