Generating random numbers using C++ standard library: the problems (codingnest.com)
96 points by amirmasoudabdol on May 18, 2020 | 42 comments


Portability of random number generation in C++ is also described in this blog post: http://anadoxin.org/blog/c-shooting-yourself-in-the-foot-4.h...


The generator/seeding design is gnarly, totally agreed.

I do want to disagree with the author on distributions. In most scenarios it is totally sufficient for distributions to be deterministic and reproducible on a given implementation/architecture instead of ossifying them to one implementation mandated by the standard.


>In most scenarios it is totally sufficient for distributions to be deterministic and reproducible on a given implementation/architecture instead of ossifying them to one implementation mandated by the standard.

Distributions are dog slow.


Compared to what? Taking rand()%n? Of course they're usually slower than that, because they provide better quality distribution of random numbers.


>Compared to what?

Look at the "Conclusion" section of this article:

https://www.pcg-random.org/posts/bounded-rands.html


std::seed_seq and the SeedSeq concept look incredibly broken, and way over-engineered. Why can't a random generator engine just take a Callable as a constructor argument and call it as many times as it needs to seed itself? SeedSeq tries to be far more than it needs to be while failing its basic requirement.


Yes indeed. In Boost.Random (which the standard library random was based on), you can do exactly that:

    boost::random_device dev;
    boost::mt19937 generator(dev);
You can also pass an appropriately sized container (as two iterators), whose contents are used directly as the seed, without any intermediate object.


PCG is not the only game in town when it comes to high-quality reproducible and fast PRNGs: https://github.com/lemire/testingRNG#visual-summary Notably, some newer PRNGs are faster, and still pass all statistical tests.


Wouldn't stability of output with given initial conditions totally freeze any ability to change the implementation in the future?


Sort of. You can't change the results, and for simulations that is often exactly what you want (consider academic publication and replication). But you could still make optimizations that do not affect the results.


My anecdote with C++'s RNGs: I was getting different sequences of random values using std::minstd_rand on arm64 and arm32.

Turns out it was because the seed type is uint_fast32_t and I was passing a 64 bit seed. On arm32 the seed was getting truncated to 32 bits, but not on arm64. It was my fault but it was still surprising.


So you need a large stream of random numbers to seed your random number generator? Seems like a bizarre chicken and egg problem... If I had a source for large quantities of random numbers, why would I even need the second random number generator?


The use case here is basically simulations, where cryptographic RNG quality is not needed but a lot of pseudo-random numbers are. Even the fastest CSPRNGs (e.g., bulk Chacha20 keystream) are much much slower than good non-cryptographic PRNGs such as PCG or a 128-bit LCG. (Note that Mersenne Twister is an awful PRNG despite its enduring popularity.)

If you play around with Chacha rounds, the minimal number of rounds necessary to provide "good", but non-cryptographic, results is 4[1], and Chacha4 (with Dolbeau's AVX acceleration) is still substantially slower than PCG or a 128-bit LCG (something like 30-50% slower than pcg64_fast, as I measure it). (Full-round AES w/ AES-NI is about 30% slower than Chacha4 w/ AVX on my hardware, but I did not test a reduced-round AES with AES-NI acceleration; that might be interesting and competitive.)

[1]: https://crypto.stackexchange.com/a/57672/51869


> Note that Mersenne Twister is an awful PRNG despite its enduring popularity.

References? I'm not being snide. I need to save those for future arguments.


"Awful" was maybe hyperbole; mea culpa. I should have said: it's not great compared to other options available today. It shouldn't be your first choice anywhere.

In short:

* It has a huge state (~2kB per instance); compare 128-bit LCG at 128 bits or even a cryptographic PRF like Chacha at 64 bytes. This is a factor of 128x larger overhead (or 32x bigger than Chacha!) and hurts, say, L1 cache of more important data, especially with multiple instances.

* It has mediocre statistical properties (fails TestU01 BigCrush, fails PractRand). (128-bit LCG passes both.)

* And it has mediocre performance. (In the PCG benchmark, pg. 35: ~2.17 ns/op vs 128/64 LCG at ~1.05 ns/op. In that diagram, labelled "MCG" — but MCGs are just a subset of LCGs.)

Ref: https://www.pcg-random.org/pdf/hmc-cs-2014-0905.pdf

See also: https://en.wikipedia.org/wiki/Mersenne_Twister#Disadvantages


PCG is incredibly simple as well. I made a 100-line implementation with ~10 SLOC (most of it is comments).

It too has a 128-bit state and hard-to-beat speed for such a naive implementation.

https://github.com/VHRanger/pcg.h


> PCG is incredibly simple as well

Yes and no. There are a lot of different functions in the PCG umbrella and that adds some mental complexity IMO. Some of the functions are very simple, including the 128-bit LCG I am referring to above ("pcg64_fast", IIRC). It is almost trivial; basically a single LoC to generate the next state and a single LoC to produce the output from that state:

https://github.com/imneme/pcg-c/blob/master/include/pcg_vari...

https://github.com/imneme/pcg-c/blob/master/include/pcg_vari...

https://github.com/imneme/pcg-c/blob/master/include/pcg_vari...

https://github.com/imneme/pcg-c/blob/master/include/pcg_vari...

Or condensed from several composed inline functions:

  // Condensed; needs a compiler with __uint128_t (gcc/clang), as pcg-c does.
  typedef __uint128_t uint128_t;
  #define PCG_DEFAULT_MULTIPLIER_128 \
      ((((uint128_t)2549297995355413924ULL) << 64) + 4865540595714422341ULL)

  static inline uint64_t pcg_rotr_64(uint64_t v, unsigned rot)
  {
      return (v >> rot) | (v << ((-rot) & 63));
  }

  uint64_t my_pcg64_fast(uint128_t *state)
  {
      *state *= PCG_DEFAULT_MULTIPLIER_128;
      // Output: XOR-folded state, rotated right by the high 6 bits of state.
      return pcg_rotr_64((uint64_t)(*state >> 64u) ^ (uint64_t)*state,
          (unsigned)(*state >> 122u));
  }


Because one of them can be slower than the other? Let's assume random.org gives true quantum randomness. Let's assume it's acceptable for my program to make an api call to random.org to get some random numbers.

That doesn't mean I want to go to the internet every time I need a random number. I'd much rather get something like 256 random bits from random.org and create the rest with a Mersenne Twister, which has an insanely large period.


If you have a small amount of random data, it can be expanded into a gigantic stream of random numbers. A good pseudo-random number generator can give you a ZiB of random data from a 128-bit or 256-bit seed.


You might want the second generator to be deterministic. Consider a fuzzing testcase that generates randomized inputs. When a failure is hit the seed is recorded and someone can reproduce it for debugging.


Apart from performance reasons, a truly random seed ensures that there is no bias and the computation can be easily rerun with a different seed, but it can also be recorded and used later to regenerate the same vastly larger sequence of pseudorandom numbers (useful, for example, for regression testing of the main computation code).


On Unix systems, you can use /dev/urandom to do this.


By default with gcc and clang on Linux, std::random_device will use /dev/urandom.

I use /dev/random and configured it with rng-tools. With rng-tools you can configure the sources of entropy for rngd.

You can add a hardware random number generator as an entropy source there. Just do not add more entropy than the size of the entropy pool or your system performance will suffer.

Some people discard a number of generated values to make it harder to guess the internal state of the generator. discard_block_engine can be used for this too.

Anyways, if you want to test your random numbers try https://webhome.phy.duke.edu/~rgb/General/dieharder.php


One thing that isn't mentioned in the article is performance.

C++ standard PRNGs, on top of all the brokenness listed in the article, are very inefficient, especially if you use distributions.


I was curious why that'd be the case. I checked GCC's implementation. It seems to be here: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-...

Looks like they had to make this very generic, generic enough to account for random number generators that return values in specific ranges. If you have a generator that guarantees a uniformly random 32-bit value, it seems much faster to do (r % (max - min + 1) + min) instead.


Taking the modulus of the desired range isn't uniform.


That's true, and obviously so, but I never realized it until now. I'm curious why r % (max - min + 1) + min is the widespread "common" way of doing this; maybe I just saw it in a few places and never questioned it.


Somewhere in the footnotes he mentions that Microsoft considers changing the implementation of its uniform int distribution for performance reasons (as a supporting argument for lack of portability).


You can't even count on std::random_device doing something sane. https://stackoverflow.com/q/18880654/5987


Yes, this is one of the first things covered in the article.


They spend a lot of time telling you std::random_device::entropy() is useless, but I don't see anywhere that it says std::random_device itself is bad. The point of my link was to show that random_device itself is unreliable, even if you use it correctly. It doesn't always work as advertised.


I read this part as describing that flaw:

> The problem is that std::random_device is poorly specified, and inscrutable.

> In theory, it should serve as an abstraction over some external source of entropy. In practice, an implementation is allowed to use any deterministic random number engine to implement it, e.g. a Mersenne twister, and there is no way to find out. There is a member function std::random_device::entropy(), which is in theory there to detect such case, but it does not work in practice.

They go on to discuss ::entropy()'s flaws after that.


[flagged]


What makes you so sure that this is not something that belongs in the C++ standard?


I wrote a 64-bit random number generator in C. I didn't think much of it; it's only pseudo-random.


> every single part of it is deeply flawed, and the best solution is to avoid using it completely

You can sadly say the same for a large part of C++'s standard features. At least the core language is mostly sound, simple, intuitive, and performant, so having to file off some edges is not that big of a problem.


> the core language is mostly sound, simple intuitive and performant

I wouldn't say so. Even object initialization is atrociously complex in C++.

https://stackoverflow.com/a/620402/

https://stackoverflow.com/a/54350350/


Or move semantics including rvalue references. In retrospect, objects should have been moved by default (and destructors for moved-from objects not called), like Rust. Then copying can just be an explicit normal method (that returns its result by moving it!).


> and destructors for moved-from objects not called

This doesn't work entirely cleanly in cases where the compiler can't track liveness, which is more of an issue for C++ than Rust, since you want to be able to move out of pointers. A simple solution is to replace move operators with an `invalidate` function the compiler calls after doing a shallow move, that sets the old value to a moved-from state.

Copy constructors (as opposed to copy assignment operators) are already a fine way to copy a value, there's no real need to change that.


Object initialisation complexity is a consequence of the design goals of C++. It's unavoidable complexity, but it's not surprising once you know about it, and it is required for consistency.


> it's not surprising once you know about it

I disagree. My first link shows subtle but important differences across versions of C++, and it is surprising. It's a foot-gun at the heart of the language. I agree with the author's comment: This is one of the dusty corners of C++ that can drive you crazy.


It's also a consequence of some unfortunate mistakes, like uniform initialization being screwed up by std::initializer_list.


C++ is pretty performant, that much is true. But no aspect of C++ is simple, almost all C++ features have complex interactions with almost all other C++ features (I can back this up with examples, if you'd like). And there's always the specter of UB making your benign program have no semantics. There is no other language as complex as C++.

Still, for its relatively large niche, there is nothing that beats well-written C++. Thankfully, that niche has been ever-shrinking, though very slowly. Perhaps Rust will stake a claim to another part of C++'s niche, like Java/C# did so successfully before.



