An implementation of Accept-Reject Algorithm¶

by Sheng BI

The accept-rejection algorithm is as follows:

  1. for $t = 1$ to $T$ do
  2.         sample $x \sim \text{Proposal}(\theta)$; sample $u ∼ \text{Unif}(0, 1)$
  3.         $\text{ratio} = \frac{\text{Target}(x)}{c \times \text{Proposal}(x)} $
  4.         if $u ≤ \text{ratio}$ then
  5.                 Accept: $\theta(t) = d$
  6.         else
  7.                 Reject, move to 2
  8.         end if
  9. end for

Let $p(\theta)$ denote the target density for variable $\tilde{\theta}$.

Let $g(\theta)$ denote the proposal density which has the same support as $p(\theta)$.

And $c\times g(\theta)$ must provide an envelope for the target density. Then, given a draw $x$ from the proposal density, we can reject or accept the draw by evaluating how likely this draw is from the target density.

Let $\text{Pr}(\tilde{\theta}\leq\theta)=\int^{\theta}p(\tilde{\theta})d\tilde{\theta}$ (the cumulative distribution function of the $\tilde{\theta})$, we then have

$$ \begin{array}{ccc} \text{Pr}(\tilde{\theta}\leq\theta) & = & \text{Pr}(x\leq\theta|u\leq\frac{p(x)}{c\times g(x)})\\ & = & \frac{\int^{\theta}\int^{\frac{p(x)}{c\times g(x)}}1du\times g(x)dx}{\int\int^{\frac{p(x)}{c\times g(x)}}1du\times g(x)dx}\\ & = & \frac{\int^{\theta}\frac{p(x)}{c\times g(x)}\times g(x)dx}{\int\frac{p(x)}{c\times g(x)}\times g(x)dx}\\ & = & \frac{\int^{\theta}p(x)dx}{\int p(x)dx}\\ & = & \int^{\theta}p(x)dx \end{array} $$

Now we use this Algorithm to sample from a Truncated Normal Distribution. Sampling from a Truncated normal distribution is an essential step in Bayesian Probit Regressions.

In [9]:
library(tidyverse) # metapackage with lots of helpful functions
library(magrittr)
library(ggplot2)
library(gridExtra)
library(truncnorm)

list.files(path = "../input")

In the following example, we implement the algorithm for a truncated normal distribution with trucation point $x=0$.

To begin with, we should find a proposal density which majorizes our target density. Here is some math based on which we regulate the parameters of our proposal density.

Our targe density is standard normal. Recall that the kernel of a standard normal density is

$$ p(\theta)\propto\text{exp}(-\frac{\theta^{2}}{2})\times I(\theta>0)$$

Now consider an exponential proposal density:

$$g(\theta)=\lambda\text{exp}(-\lambda\theta)$$

A more general form of this proposal has the translated exponential kernel:

$$\lambda\text{exp}(-\lambda(\theta-k))=\text{exp}(\lambda k)\times\lambda\text{exp}(-\lambda\theta)$$

where the parameter $k$ gives us freedom to shift the magnitude of the kernel.

Set $k=0$ where $0$ is the truncation point.

We now want to find an envelope for $p(\theta)$. Let us first consider the following ratio:

$$\text{ratio}=\frac{\text{exp}[-\frac{\theta^{2}}{2}+\lambda(\theta-k)]}{\lambda}$$

Given $\alpha$ and $\theta$, this ratio is maximized when $\theta=\lambda$.

Then this ratio becomes:

$$\text{ratio}=\frac{\text{exp}[\frac{\lambda^{2}}{2}-\lambda k)]}{\lambda}$$

Differentiate with respect to $\lambda$ again, we obtain

$$\hat{\lambda}^{2}-\hat{\lambda}k-1=0,$$

which implies that when

$$ \hat{\lambda}=\frac{k\pm\sqrt{k^{2}+4}}{2},$$

this ratio is maximized. Substitue $k=0$ into the solution, we obtain $\hat{\lambda}=1$.

So, we could set $\lambda=1$ which ensures that our ratio is maximized. However, setting $\lambda=1$ is not sufficient to ensure that the ratio is less than 1 everwhere in its support. Adding a scalar $c=2$ will suffice for our purpose.

In [10]:
set.seed(134)

accRej <- function(T = 10000, c_arg = 2, lambda_arg = 1, k_arg = 0) {
    
    target <- function(x) {
        result <- ifelse(x>k_arg, (exp(-(x^2)/2)/sqrt(2*pi))/(1- pnorm(q = k_arg)), 0);
        return(result)
    }
    
    proposal <- function(x, l) {
        result <- ifelse(x>0, l * exp(-l*x), 0);
        return(result)
    }
    
    c <- c_arg * exp(lambda_arg * k_arg);
    
    ratio <- function(x,c,l) {
        result <- ifelse(proposal(x,l) == 0, 0, target(x)/(c * proposal(x,l)))
        return(result)
    }

    theta <- rep(0, T);
    
    for (t in 1:T) {
        
        ## initialize draw for each t, 
        draw <- rexp(n = 1, rate = lambda_arg); 
        r <- ratio(draw, c_arg, lambda_arg);
        
        ## if rejected, draw again; otherwise, accept.
        while (runif(1,0,1) > r) {
            draw <- rexp(n = 1, rate = lambda_arg);
            r    <- ratio(draw, c_arg, lambda_arg); 
        }
        
        theta[t] <- draw
    }
    
    return(theta)
}

mySim <- accRej()

Let us compare our results with the results of rtruncnorm function from the embedded package "truncnorm".

In [11]:
graph.a <- function(d1, d2) {
    df1 <- data.frame(x=d1)
    df2 <- data.frame(x=d2)
    
    df1$whose <- 'My result'
    df2$whose <- 'Truncnormal Package'
    
    df <- rbind(df1,df2)
    p1 <- ggplot(data = df, mapping = aes(x = x, fill = df$whose)) + 
    geom_histogram(alpha = 0.5, aes(y = ..density..), position = 'identity') + labs(fill = '')
    
    p1
}


graph.a(mySim, (rtruncnorm(10000, a=0, b=Inf, mean = 0, sd = 1)))
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Now when the truncation point is $x=1$, the optimal $lambda$ value is $lambdad = 1.62$, according to $\hat{\lambda}=\frac{k\pm\sqrt{k^{2}+4}}{2}$.

In [12]:
mySim <- accRej(T = 10000, c_arg = 2, lambda_arg = 1.62, k_arg = 1)
graph.a(mySim, (rtruncnorm(10000, a=1, b=Inf, mean = 0, sd = 1)))
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Ref¶

  1. https://stackoverflow.com/questions/3541713/how-to-plot-two-histograms-together-in-r
  2. https://rpubs.com/kaz_yos/ggplot2-stat-function
  3. Jackman, Simon. Bayesian analysis for the social sciences. Vol. 846. John Wiley & Sons, 2009.