It is not uncommon that for large-scale simulations you need a large volume of high-quality (pseudo)random numbers. The relatively short period of the C standard library’s rand function may make it impractical for your purpose; and you might need to resort to stronger generators such as the Linux-provided /dev/random and /dev/urandompseudo-devices. But calling those is a relatively expensive process as /dev/random may stall until its “entropy pool” is repleted.

The /dev/urandom is the unblocked version of /dev/random, but it is also demonstrably less random. Moreover, one has to access either pseudo-devices through the file system, which in certain case introduces a noticeable impact on performance. Using other sources of randomness such as a hardware random number generator may also incur significant delays.

To avoid being stalled, I propose a simple method that will allow you to recycle bits from an expensive pseudo-random source without polling it too often.

The bit recycling algorithm goes like this:

The Random Bit Recycling Algorithm

A few bits are drawn from the expensive random source (it may be /dev/random or a similar (comparatively) slow generator) and are merged with the remaining bits in the bit buffer using a strong hash function. Once the bit buffer is rejuvenated with new, fresh, hashed bits, you can resume its depleting, returning small bunches of bits each time the program calls the random generator.

It is unclear (to me, at least) how strong the hash function has to be in order to produce satisfactory results. Clearly, if the expensive random source is truly a high grade (pseudo)random generator, shuffling the new bits with the remaining bits may be quite enough to yield interesting results. If, on the contrary, the randoms bits are quite weak, you will need a stronger hash functions (maybe à la MD5 to at least inject an apparent randomness into the generator.

It is also unclear what portion of the the buffer should not be depleted. From the figure, you may guess that it’s about 2/3 and 1/3, but it can be quite different. I would guess the weaker the hash function is, the more random bits you have to keep in the buffer. Conversely, if the hash function is very strong, you may be able to keep much fewer bits in the buffer and still provide high quality random sequences.

[…] first post of this series, I discussed a method to generate a random (guaranteed) permutation. On another occasion, I presented a method to recycle expensive random bits. Today, I will discuss something I had to do […]

[…] generating uniform random points in a triangle, on a sphere, the inversion method, even recycling expensive random bits. In this entry, I will use the rejection method to generate random numbers […]

[…] first post of this series, I discussed a method to generate a random (guaranteed) permutation. On another occasion, I presented a method to recycle expensive random bits. Today, I will discuss something I had to do […]

[…] generating uniform random points in a triangle, on a sphere, the inversion method, even recycling expensive random bits. In this entry, I will use the rejection method to generate random numbers […]