In the first post of this series, I discussed how to generate permutations of sequences using the Fisher-Yates method and I explained (although indirectly) how a linear congruential generator works. In a second post, I explained how to generate 2D points uniformly and randomly distributed a triangle, discussing the method of rejection. In a third post, I’ve discussed how to generate points on a sphere.

All these methods have something in common: they are based on the uniform (pseudo)random generator, and they map uniform numbers onto a shape (or move numbers around, in the first case). What if we need another density function than uniform?

We discussed the rejection method here, but in a somewhat cursory fashion. It’s worth discussing (I may in a future post) but its iterative nature is not always suitable, and it’s finicky to get right. No, in this post I’m going to discuss the inversion method that may be easier to get right when the density function considered is playing nice. Let met explain.

Let be a density function that gives the value for for a random variable . It is a function such that for all (for some range , which may or may not be the reals ) and such that

The cumulative density function, (that is a notation taken by many author where the lowercase letter is the density and the uppercase letter the cumulative density), which is the function yielding , is given by:

where is the lower bound (or infimum, or , whatever delimits the range on the left). The function is monotone non-decreasing, and it may be invertible, that is, may exist and return . If it is invertible, then we can use it to generate a random number drawn with density .

The key observation is that for all , and that, therefore, using will allow us to access the whole range through . Drawing uniformly will make distributed as !

OK, maybe an illustration will make that clear(er):

On the graph, we have and , which are our density and cumulative density functions, respectively. (let us say that indeed the curve is the integral of the curve .) We see that three uniformly spaced numbers on the axis map, through to very different places on the axis and that the spaces between the points on the are spaced non-uniformly. The space between and is such that !

Let us now see how it would work with a real density function, say, the exponential distribution (the example in [2]).

We have

,

for a distribution parameter (the “rate parameter”, related to the expectation). The cumulative distribution function is

.

We compute the inverse:

We can now feed a function with a uniform generator on .

*

* *

In *Numerical Recipes* [2], we find a rather lengthy, but paradoxically terse, discussion on all kind of generators. In [2], the discussion is minimalist, but outlines a few methods correctly. For some distributions, there are better methods than inverses. Especially when the inverse is analytically troublesome. For example, the ziggurat algorithm, a rejection-based algorithm applicable to monotone-decreasing densities (possibly dealing with symmetries), seems to be a runner-up against more traditional methods.

[1] W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flanner — *Numerical Recipes in C: The Art of Scientific Computing* — 3rd ed, Cambridge University Press, 2007

[2] S. M. Ross — *Introduction to Probability and Statistics for Engineers and Scientists* — 3rd ed, Academic Press, 2004

[…] I’ve discussed permutations, generating uniform random points in a triangle, on a sphere, the inversion method, even recycling expensive random bits. In this entry, I will use the rejection method to generate […]

I really like your plot, but for clarity, the CDF is generally described as the cumulative distribution function rather than density, which is reserved for the PDF.