Random Variable Generation: Probability Integral Transform

Suppose we would like to generate \(X\sim f\), where \(f\) is the probability density function (pdf) of \(X\). If the generalized inverse of the corresponding cumulative distribution function (cdf) \(F\), namely \(F^{-1}(u) = \inf\{x : F(x) \geq u\}\), can be computed, then we can use the Probability Integral Transform: if \(U\sim Unif[0,1]\), then \(F^{-1}(U)\) has cdf \(F\). The only other requirement is that we have the ability to simulate \(U\sim Unif[0,1]\).




This example uses an arbitrary, unnamed distribution (i.e. one for which pre-packaged routines are unlikely to exist).




Shiny app by Peter Chi
Base R code by Peter Chi
Shiny source files: GitHub Gist

Demonstration with \(X \sim Exp(\lambda)\):


Details:

1) First we note that, since \(f(x)=\lambda e^{-\lambda x}\) for \(x \geq 0\) for an Exponential random variable, the cdf is \(F(x) = \int_0^x \lambda e^{-\lambda t}\,dt = 1-e^{-\lambda x}\), for \(x \geq 0\)

2) Next, solving \(u = 1-e^{-\lambda x}\) for \(x\) gives the inverse \(F^{-1}(u) = \frac{-\ln(1-u)}{\lambda}\), for \(0 \leq u < 1\)

3) Thus, we generate \(U\sim Unif[0,1]\), and plug these values into \(F^{-1}\) to obtain draws of \(X\)
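
The three steps above can be sketched in a few lines. This is a minimal Python illustration using only the standard library, not the app's own R code; the function name `sample_exponential` and the rate value 2.0 are hypothetical choices made for the example.

```python
import math
import random


def sample_exponential(lam, n, rng=None):
    """Draw n values from Exp(lam) using F^{-1}(u) = -ln(1 - u) / lam."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if rng is None:
        rng = random.Random()
    # rng.random() returns u in [0, 1), so 1 - u lies in (0, 1] and the log is finite.
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]


# Hypothetical usage: the sample mean should be close to 1 / lam = 0.5.
draws = sample_exponential(lam=2.0, n=10000, rng=random.Random(0))
print(sum(draws) / len(draws))
```

Passing a seeded `random.Random` instance keeps the draws reproducible without touching the global generator.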

Demonstration with \(f(x) = \frac{x}{8} \cdot 1_{\{0 \leq x \leq 4\}}\)


Details:

1) First we note that the cdf is \(F(x) = \int_0^x \frac{t}{8}\,dt = \frac{x^2}{16}\), for \(0 \leq x \leq 4\)

2) Next, solving \(u = \frac{x^2}{16}\) for \(x \geq 0\) gives the inverse \(F^{-1}(u) = 4 \sqrt{u}\), for \(0 \leq u \leq 1\)

3) Thus, we generate \(U\sim Unif[0,1]\), and plug these values into \(F^{-1}\) to obtain draws of \(X\)
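
As a rough check of these steps, the following Python sketch (standard library only; the name `sample_linear_density` is made up for this example) draws from \(f(x) = \frac{x}{8}\) on \([0,4]\) and compares the sample mean with the theoretical mean \(\int_0^4 x \cdot \frac{x}{8}\,dx = \frac{8}{3}\).

```python
import math
import random


def sample_linear_density(n, rng=None):
    """Draw n values with pdf f(x) = x / 8 on [0, 4] using F^{-1}(u) = 4 * sqrt(u)."""
    if rng is None:
        rng = random.Random()
    return [4.0 * math.sqrt(rng.random()) for _ in range(n)]


# Hypothetical usage: the sample mean should be close to 8 / 3.
draws = sample_linear_density(n=10000, rng=random.Random(0))
print(sum(draws) / len(draws), 8.0 / 3.0)
```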

Random Variable Generation: Accept-Reject Algorithm

Suppose we would like to generate \(X\sim f\) but can neither do it directly nor via the Probability Integral Transform (e.g. if the generalized inverse of the cdf is unavailable). We can instead generate \(Y\sim g\), where \(g\) is a density that we are able to simulate from, and accept or reject each value, provided that the following two conditions hold:

1) \(f\) and \(g\) have the same support

2) We can find a constant \(M\) such that \(f(x)/g(x) \leq M \) for all \(x\)







This example is with the standard normal distribution, truncated at 2 (i.e. allowing for values greater than or equal to 2 only).
It would indeed be possible to simulate a normal random variable truncated at 2 by using a pre-packaged routine for a standard normal random variable (such as rnorm in R), and then discarding all values below 2.
However, this would be extremely inefficient, as we would discard more than 97% of all values that we generate (since \(\Phi(2) \approx 0.977\)). With the accept-reject algorithm, we of course also discard values, but far fewer: each proposal is accepted with probability \(1/M\), which for the proposal used in the truncated normal demonstration below is about 0.42, so roughly 58% of the generated values are discarded.
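
The two discard rates being compared can be computed directly. The sketch below is a Python illustration using only the standard library; it assumes the shifted-exponential proposal \(g(y) = e^{2-y}\) on \(y \geq 2\) and the bound \(M=\frac{\phi(2)}{1-\Phi(2)}\) from the truncated normal demonstration further down, and it relies on the standard fact that accept-reject accepts each proposal with probability \(1/M\). The helper names are hypothetical.

```python
import math


def std_normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    """Standard normal cdf Phi(x), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Naive approach: keep a standard normal draw only if it is >= 2.
naive_keep_rate = 1.0 - std_normal_cdf(2.0)

# Accept-reject with g(y) = exp(2 - y) on y >= 2 (assumed proposal).
M = std_normal_pdf(2.0) / naive_keep_rate
accept_reject_keep_rate = 1.0 / M

print(naive_keep_rate, accept_reject_keep_rate)
```

Under these assumptions the naive approach keeps a little over 2% of its draws, while accept-reject keeps a bit more than 40% of its proposals.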



Shiny app by Peter Chi
Base R code by Peter Chi
Shiny source files: GitHub Gist

Demonstration with \(X \sim Beta(\alpha,\beta)\):


In the right panel: after initially being shown in red, rejected points remain in grey and stack down from the top; after initially being shown in green, accepted points remain in black and stack up from the bottom, to fill the shape of the theoretical pdf of \(X\).


Details:

1) The first step is to generate \(Y \sim g\). In this example, we use \(Y \sim Unif[0,1]\) (shown in the right panel, along with the true distribution that we are trying to simulate from). Notice that the Unif[0,1] distribution does indeed have the same support as the Beta distribution.

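2) Next, we need to find an appropriate value of M. For the Beta example, \(g(y)=1\) on \([0,1]\), so \(f(y)/g(y) = f(y)\) and the maximum of the Beta pdf works as \(M\). This maximum is finite when \(\alpha \geq 1\) and \(\beta \geq 1\); for \(\alpha, \beta > 1\) it is attained at the mode \(y = \frac{\alpha-1}{\alpha+\beta-2}\).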


3) We also generate \(U \sim Unif[0,1]\) independently of \(Y\) (left panel), and then accept \(y\) as a value of \(X\) if \(U \leq \frac{f(y)}{Mg(y)}\), and reject it otherwise.
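
The steps above can be sketched as follows. This is an illustrative Python version using only the standard library, not the app's R code. It assumes \(\alpha > 1\) and \(\beta > 1\), so that the Beta pdf has a finite maximum at its interior mode \(\frac{\alpha-1}{\alpha+\beta-2}\); the function names and the parameter values \(\alpha=2\), \(\beta=5\) are hypothetical.

```python
import math
import random


def beta_pdf(x, a, b):
    """Beta(a, b) density on [0, 1]."""
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return const * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)


def sample_beta_accept_reject(a, b, n, rng=None):
    """Return (draws, proposals_used) for Beta(a, b) with a Unif[0, 1] proposal."""
    if a <= 1.0 or b <= 1.0:
        raise ValueError("this sketch assumes a > 1 and b > 1")
    if rng is None:
        rng = random.Random()
    mode = (a - 1.0) / (a + b - 2.0)
    M = beta_pdf(mode, a, b)  # g(y) = 1 on [0, 1], so f / g <= max f = M
    draws, proposals = [], 0
    while len(draws) < n:
        y = rng.random()  # step 1: proposal Y ~ Unif[0, 1]
        u = rng.random()  # step 3: independent U ~ Unif[0, 1]
        proposals += 1
        if u <= beta_pdf(y, a, b) / M:  # accept if U <= f(y) / (M g(y))
            draws.append(y)
    return draws, proposals


# Hypothetical usage: the mean of Beta(2, 5) is 2 / 7.
draws, proposals = sample_beta_accept_reject(2.0, 5.0, n=5000, rng=random.Random(0))
print(sum(draws) / len(draws), len(draws) / proposals)
```

The second printed value is the observed acceptance rate, which should be near \(1/M\).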

Demonstration with Truncated Normal


In the right panel: after initially being shown in red, rejected points remain in grey and stack down from the top; after initially being shown in green, accepted points remain in black and stack up from the bottom, to fill the shape of the theoretical pdf of \(X\).


Details:

1) The first step is to generate \(Y \sim g\). In this example, we will use \(g(y) = e^{2-y} \cdot 1_{\{y \geq 2\}} \) (shown in the right panel, along with the true distribution that we are trying to simulate from).

2) Next, we need to find an appropriate value of M. For this example, we notice the following:

$$\frac{f(y)}{g(y)} = \frac{\frac{1}{\sqrt{2 \pi}}e^{-\frac{1}{2}y^2}1_{\{y \geq 2\}} \cdot \left[\frac{1}{1-\Phi(2)}\right] }{e^{2-y}1_{\{y \geq 2\}}}$$

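where \(\Phi\) is the standard normal cdf. For \(y \geq 2\) this ratio is proportional to \(e^{-\frac{1}{2}y^2 + y - 2}\), whose exponent has derivative \(1-y < 0\), so the ratio is decreasing on the support and is at its maximum at \(y=2\). Thus, \(M=\frac{\phi(2)}{1-\Phi(2)}\), where \(\phi\) is the standard normal pdf.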

3) We also generate \(U \sim Unif[0,1]\) independently of \(Y\) (left panel), and then accept \(y\) as a value of \(X\) if \(U \leq \frac{f(y)}{Mg(y)}\), and reject it otherwise.
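
A minimal Python sketch of these steps follows (standard library only; the function name is hypothetical, and this is not the app's R code). Two details are assumptions added for the illustration: the proposal \(Y \sim g\) is drawn by the inverse transform \(Y = 2 - \ln(1-V)\) with \(V \sim Unif[0,1]\), and the acceptance ratio is simplified algebraically, since with \(M=\frac{\phi(2)}{1-\Phi(2)}\) the normalizing constants cancel and \(\frac{f(y)}{Mg(y)} = e^{-y(y-2)/2}\).

```python
import math
import random


def sample_truncated_normal(n, rng=None):
    """Return (draws, proposals_used) for a standard normal truncated to y >= 2."""
    if rng is None:
        rng = random.Random()
    draws, proposals = [], 0
    while len(draws) < n:
        # Step 1: Y has density exp(2 - y) on y >= 2 (a shifted Exp(1) draw).
        y = 2.0 - math.log(1.0 - rng.random())
        # Step 3: independent uniform used for the accept/reject decision.
        u = rng.random()
        proposals += 1
        # f(y) / (M g(y)) = phi(y) * exp(y - 2) / phi(2) = exp(-y * (y - 2) / 2)
        if u <= math.exp(-0.5 * y * (y - 2.0)):
            draws.append(y)
    return draws, proposals


# Hypothetical usage: report the sample mean and the observed acceptance rate.
draws, proposals = sample_truncated_normal(n=5000, rng=random.Random(0))
print(sum(draws) / len(draws), len(draws) / proposals)
```

The mean of this truncated normal equals \(\frac{\phi(2)}{1-\Phi(2)}\), which happens to coincide with \(M\), so the first printed value should be near \(M\) and the second near \(1/M\).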