Methods: R Software for Bayesian Analysis
Introduction
Probabilistic Programming Languages (PPLs) provide a way of specifying a Bayesian model together with a choice of samplers that can be used to simulate parameter values from the model’s joint posterior. The Wikipedia page on Probabilistic Programming (https://en.wikipedia.org/wiki/Probabilistic_programming) lists about 40 such languages, of which a handful can be run in, or via, R.
As the name suggests, BUGS (Bayesian Inference using Gibbs Sampling) is a PPL that uses Gibbs sampling. It was one of the first PPLs and its language for specifying the model has been widely copied and adapted. The BUGS project dates from 1989 when Classic BUGS was released. This was followed by WinBUGS, OpenBUGS and most recently MultiBUGS. The homepage for the project is https://www.mrc-bsu.cam.ac.uk/software/bugs/.
The BUGS language has been adopted by JAGS (Just Another Gibbs Sampler) and by an R-based project called nimble. In fact, for speed, nimble translates the R code into C++ and compiles it before it is run, but to the user, the appearance is of working with R code. nimble provides a wider range of samplers than BUGS and gives the user finer control over those samplers. It also makes it easy for the user to create their own samplers, or to define their own distributions.
Stan (https://mc-stan.org/) differs from BUGS in that it uses Hamiltonian Monte Carlo (HMC) rather than Gibbs Sampling. HMC is a derivative-based Markov chain Monte Carlo (MCMC) algorithm; calculation of the partial derivatives of the log-posterior increases the time needed for each iteration, but the resulting samples usually have lower autocorrelation, so far fewer iterations are needed compared with Gibbs Sampling. The net result is that HMC is often very efficient.
The Stan language is similar to the BUGS language, but it has more built-in functions and requires slightly more structuring of the code. For increased speed, the user’s model is translated into C++ and compiled before it is run.
HMC requires exact derivatives, which for complex models can be difficult to obtain. The TensorFlow project (https://www.tensorflow.org/) provides scalable open-source tools for machine learning that include automatic differentiation. The greta project (https://greta-stats.org/) has used the TensorFlow library to create an HMC-based PPL that, like nimble, can be programmed directly in R.
nimble and greta are fully integrated into R and can be run by downloading the appropriate packages. Stan, BUGS and JAGS are standalone programs that can be run from within R using interface packages; rstan is the most tightly integrated with R. The various flavours of BUGS require the user to download the free external software before interfacing with it from within R. These interface packages include rjags, R2WinBUGS, R2OpenBUGS and R2MultiBUGS. There is, however, a degree of duplication, with other user groups creating alternative interface packages, such as RBugs and runjags.
As well as these PPLs, R also has a range of packages for fitting very specific Bayesian models, but in this post my interest is purely in the mechanics of using BUGS, Stan, nimble and greta; these are the PPLs that I will rely on when I re-analyse the Sliced datasets.
A full list of R packages for Bayesian analysis is available on the CRAN task view (https://cran.r-project.org/web/views/Bayesian.html).
Overview of this Post
In this post, I will take a simple Bayesian Poisson regression model and fit it using flavours of BUGS, nimble, Stan and greta. The emphasis will be on the code rather than the model, so I will not try to interpret the output or even ensure that the algorithms have converged. There is a companion methods post in which I look at ways of interpreting the MCMC output; these methods are essentially the same regardless of the software used to fit the model.
The data
I have chosen to model ONS (Office for National Statistics) data on deaths due to alcohol in the UK. The data can be downloaded as an Excel workbook from https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/causesofdeath/datasets/alcoholspecificdeathsintheukmaindataset.
I will analyse data from table 2 of the Excel workbook that gives numbers of deaths and death rates by age, gender and year. The code that reads the data and prepares it for analysis is given in the appendix; it uses standard R functions and is not crucial to my main task. That code saves the clean data frame in a file called alc.rds.
library(tidyverse)
# --- paths on my desktop --------------------------------------
home <- "C:/Projects/kaggle/sliced/methods/methods_bayes_software"
filename <- "data/rData/alc.rds"
# --- read the clean data that is to be analysed ---------------
alcDF <- readRDS( file.path(home, filename))
Here is a traditional likelihood analysis of a Poisson regression model that incorporates a linear increase in the rate per 100,000 over time plus categorical age and gender effects, but no interactions.
library(broom)
# --- Poisson regression in glm() -----------------------------------
glm(deaths ~ year + age + gender,
offset = log(pop),
family = "poisson",
data = alcDF %>% mutate(year = year - 2001)) %>%
tidy()
## # A tibble: 16 x 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) -1.68 0.0586 -28.7 4.82e-181
## 2 year 0.0143 0.000465 30.7 4.63e-207
## 3 age25-29 1.50 0.0646 23.2 7.12e-119
## 4 age30-34 2.53 0.0606 41.7 0
## 5 age35-39 3.25 0.0594 54.7 0
## 6 age40-44 3.81 0.0590 64.5 0
## 7 age45-49 4.18 0.0588 71.1 0
## 8 age50-54 4.37 0.0587 74.4 0
## 9 age55-59 4.44 0.0587 75.6 0
## 10 age60-64 4.42 0.0588 75.2 0
## 11 age65-69 4.26 0.0589 72.3 0
## 12 age70-74 3.90 0.0593 65.7 0
## 13 age75-79 3.53 0.0602 58.7 0
## 14 age80-84 3.06 0.0624 49.0 0
## 15 age85+ 2.41 0.0671 35.9 1.60e-281
## 16 gendermale 0.778 0.00571 136. 0
The baseline category (intercept) is women aged 20-24 in 2001.
According to the model, the rate per 100,000 in the baseline group is exp(-1.68)=0.19. The downloaded Excel file does not give the measured rate for this group, because there was only 1 death in women aged 20-24 in 2001 and the ONS consider the rate unreliable, but in 2002 women aged 20-24 had a rate of 0.2 per 100,000, very much in line with the intercept.
The model suggests that the rate of alcohol deaths is increasing over time, the rate is higher in men than in women and it is highest in middle aged people. Everything is off-the-scale significant.
Of course, there is still much to do, even in a non-Bayesian analysis. I ought to look at model fit, check for over-dispersion, question the linearity over time and look for interactions. However, in this post, all I want to do is reproduce this analysis using Bayesian software. In my companion post, I will look more closely at the inspection of Bayesian results.
BUGS Code
The best way to learn the BUGS language is to copy and adapt other people’s code, and in that spirit the BUGS project provides a lot of examples. They can be accessed either from the help facilities of the specific implementations of BUGS, or via the BUGS book that is online at https://www.mrc-bsu.cam.ac.uk/software/bugs/the-bugs-project-the-bugs-book/bugs-book-examples/.
The OpenBUGS manual can be downloaded as a pdf from https://www.openbugs.net/w/Manuals and provides a useful reference to the language and the available functions and distributions. Other flavours of BUGS may have slightly more or slightly fewer functions, but the language remains much the same.
The BUGS language provides a way of specifying the structure of the model together with the priors. Calculations are denoted by <-, as they are in R, and distributions are denoted by ~. Vector processing is not provided and so BUGS code tends to include a lot of loops, which use the same structure as in R. Here is the BUGS code for the likelihood part of the Poisson regression model.
for( i in 1:560 ) {
log(mu[i]) <- b0 + b1*year[i] + b2[age[i]] + b3*gender[i] + offset[i]
deaths[i] ~ dpois(mu[i])
}
This code assumes that deaths, year, age, gender and offset are provided as data. It is usually best to perform any preliminary data manipulation in R before sending the data to BUGS. Notice how BUGS allows the log function to be placed on the left-hand side of the calculation.
To complete the code, we must specify our priors for b0, b1, the vector b2, and b3. For illustration, I use very vague priors.
b0 ~ dnorm(0, 0.0001)
b1 ~ dnorm(0, 0.0001)
b2[1] <- 0
for(j in 2:14) {
b2[j] ~ dnorm(0, 0.0001)
}
b3 ~ dnorm(0, 0.0001)
Age is a factor with 14 levels, which I parameterise in the usual glm() style by setting the coefficient of the first level to zero.
In BUGS, the normal distribution is parameterised by the mean and the precision (1/variance). A precision of 0.0001 is equivalent to a variance of 10000 and a standard deviation of 100, so the prior on the intercept says that I believe, with 95% probability, that it lies in the range -200 to 200. Since the intercept represents the log rate per 100,000 people in the baseline group, this is like saying that I expect between exp(-200) and exp(200) cases per 100,000 in 2001 amongst women aged 20-24. With a little thought, I could certainly narrow down this range. Perhaps I could look up the rates in earlier years, or the rates in other developed countries, or just use common sense; the rate is unlikely to be less than 0.01 (1 in 10 million) and it is unlikely to be over 100 (1 in 1,000). This would give a range for the log rate of -4.5 to +4.5, so a zero-centred normal distribution with a standard deviation of 2 (variance 4, precision 0.25) would not be unreasonable.
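The precision arithmetic above can be checked directly in R:

```r
# --- check the precision -> standard deviation conversion ----------
prec <- 0.0001
sd   <- 1 / sqrt(prec)                      # = 100
# approximate central 95% interval of the dnorm(0, 0.0001) prior
qnorm(c(0.025, 0.975), mean = 0, sd = sd)   # roughly -196 to 196
# the tighter prior suggested above: sd = 2 corresponds to precision
1 / 2^2                                     # = 0.25
```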
Specifying a good prior is an important, if neglected, part of a Bayesian analysis, but my aim here is to concentrate on the PPLs, so I’ll not pursue this. Here is my full BUGS model code.
model {
for( i in 1:560 ) {
log(mu[i]) <- b0 + b1*year[i] + b2[age[i]] + b3*gender[i] + offset[i]
deaths[i] ~ dpois(mu[i])
}
b0 ~ dnorm(0, 0.0001)
b1 ~ dnorm(0, 0.0001)
b2[1] <- 0.0
for(j in 2:14) {
b2[j] ~ dnorm(0, 0.0001)
}
b3 ~ dnorm(0, 0.0001)
}
BUGS is implemented as an external program, so to run an analysis the essential components, that is, the model code, data, initial values and run specification (number of iterations, etc.), are each created in R and then written to text files in a format that can be read by the external program. Most of this is done automatically by the interface package.
For this example, I use OpenBUGS together with the R interface package R2OpenBUGS.
To get the model code into a file, one could type it directly into a text editor such as the RStudio editor, but it is more reproducible to enter it as R code. write.model() is a function provided by R2OpenBUGS that takes the code stored in an R function and writes it to a text file.
library(R2OpenBUGS)
# --- function that stores the model code ---------------------------
modelFunc <- function() {
for( i in 1:560 ) {
log(mu[i]) <- b0 + b1*year[i] + b2[age[i]] + b3*gender[i] + offset[i]
deaths[i] ~ dpois(mu[i])
}
b0 ~ dnorm(0, 0.0001)
b1 ~ dnorm(0, 0.0001)
b2[1] <- 0.0
for(j in 2:14) {
b2[j] ~ dnorm(0, 0.0001)
}
b3 ~ dnorm(0, 0.0001)
}
# --- R2OpenBUGS function that exports the code ---------------------
write.model(modelFunc, file.path(home, "bugs/alcModel.txt"))
I organise my work by saving the model code in a folder called bugs and by using a temp folder as my working directory.
Preparing the data
All of the data used by the model must be placed in a single R list. This list is also written to a text file, but that is done behind the scenes by the R2OpenBUGS package.
bugsData <- list( deaths = alcDF$deaths,
offset = log(alcDF$pop),
year = alcDF$year - 2001,
gender = as.numeric( alcDF$gender == "male"),
age = as.numeric( alcDF$age))
Initial values
Gibbs sampling requires a starting point. The chosen initial values are also placed in a list. This can be done in two ways; as a function, or as a list of lists.
# --- 1: initial values as a function --------------------------
bugsInits <- function() {
list( b0=0, b1=0, b2=c(NA, rep(0,13)), b3=0)
}
# --- 2: initial values as a list of lists ---------------------
bugsInits <- list(
list( b0=0, b1=0, b2=c(NA, rep(0,13)), b3=0)
)
Making a list containing a list may, at first sight, look redundant, but often we will want to run multiple chains, in which case each set of initial values is placed in a separate internal list.
Notice that the first level of b2, which I set to zero in my model code, is specified as missing in the initial values.
Model fitting in OpenBUGS
Here is the code that fits the model using OpenBUGS. I use the system.time() function for timing. The options are more or less self-explanatory, but details are given in the help file.
library(R2OpenBUGS)
# --- path to OpenBUGS on my computer ---------------------------
openbugsExe <- "C:/Software/OpenBUGS/OpenBUGS323/OpenBUGS.exe"
# --- fit model using OpenBUGS ----------------------------------
bugs( data = bugsData,
inits = bugsInits,
parameters = c("b0", "b1", "b2", "b3"),
model.file = file.path(home, "bugs/alcModel.txt"),
n.chains = 1,
n.iter = 1500,
n.burnin = 500,
n.thin = 1,
DIC = FALSE,
bugs.seed = 6,
OpenBUGS = openbugsExe,
working.directory = file.path(home, "temp")
) %>%
saveRDS( file.path( home, "data/dataStore/alcBugs01.rds")) %>%
system.time()
On my computer this code took 7.3 seconds to run.
Accessing the Results in R
The structure returned by the bugs() function is quite complex as we can see using str().
# --- read the saved bugs() results ----------------------------------
results <- readRDS( file.path( home, "data/dataStore/alcBugs01.rds"))
# --- inspect the structure of the object ----------------------------
str(results)
## List of 20
## $ n.chains : num 1
## $ n.iter : num 1500
## $ n.burnin : num 500
## $ n.thin : num 1
## $ n.keep : num 1000
## $ n.sims : num 1000
## $ sims.array : num [1:1000, 1, 1:16] -1.3 -1.31 -1.29 -1.3 -1.3 ...
## ..- attr(*, "dimnames")=List of 3
## .. ..$ : NULL
## .. ..$ : NULL
## .. ..$ : chr [1:16] "b0" "b1" "b2[2]" "b2[3]" ...
## $ sims.list :List of 4
## ..$ b0: num [1:1000] -1.55 -1.42 -1.56 -1.56 -1.51 ...
## ..$ b1: num [1:1000] 0.0142 0.0141 0.0145 0.0144 0.0136 ...
## ..$ b2: num [1:1000, 1:13] 1.38 1.29 1.39 1.33 1.34 ...
## ..$ b3: num [1:1000] 0.781 0.782 0.778 0.778 0.773 ...
## $ sims.matrix : num [1:1000, 1:16] -1.55 -1.42 -1.56 -1.56 -1.51 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : NULL
## .. ..$ : chr [1:16] "b0" "b1" "b2[2]" "b2[3]" ...
## $ summary : num [1:16, 1:7] -1.51 0.0143 1.3254 2.3552 3.0782 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:16] "b0" "b1" "b2[2]" "b2[3]" ...
## .. ..$ : chr [1:7] "mean" "sd" "2.5%" "25%" ...
## $ mean :List of 4
## ..$ b0: num -1.51
## ..$ b1: num 0.0143
## ..$ b2: num [1:13(1d)] 1.33 2.36 3.08 3.63 4 ...
## ..$ b3: num 0.778
## $ sd :List of 4
## ..$ b0: num 0.0661
## ..$ b1: num 0.000442
## ..$ b2: num [1:13(1d)] 0.0734 0.0671 0.0668 0.0661 0.0662 ...
## ..$ b3: num 0.00579
## $ median :List of 4
## ..$ b0: num -1.53
## ..$ b1: num 0.0143
## ..$ b2: num [1:13(1d)] 1.34 2.38 3.1 3.65 4.03 ...
## ..$ b3: num 0.778
## $ root.short : chr [1:4] "b0" "b1" "b2" "b3"
## $ long.short :List of 4
## ..$ : int 1
## ..$ : int 2
## ..$ : int [1:13] 3 4 5 6 7 8 9 10 11 12 ...
## ..$ : int 16
## $ dimension.short: num [1:4] 0 0 1 0
## $ indexes.short :List of 4
## ..$ : NULL
## ..$ : NULL
## ..$ :List of 13
## .. ..$ : num 2
## .. ..$ : num 3
## .. ..$ : num 4
## .. ..$ : num 5
## .. ..$ : num 6
## .. ..$ : num 7
## .. ..$ : num 8
## .. ..$ : num 9
## .. ..$ : num 10
## .. ..$ : num 11
## .. ..$ : num 12
## .. ..$ : num 13
## .. ..$ : num 14
## ..$ : NULL
## $ last.values :List of 1
## ..$ :List of 4
## .. ..$ b0: num -1.57
## .. ..$ b1: num 0.0145
## .. ..$ b2: num [1:13] 1.33 2.43 3.14 3.7 4.07 ...
## .. ..$ b3: num 0.78
## $ isDIC : logi FALSE
## $ model.file : chr "C:/Projects/kaggle/sliced/methods/methods_bayes_software/bugs/alcModel.txt"
## - attr(*, "class")= chr "bugs"
R2OpenBUGS defines a version of the print() function for bugs objects that summarises the results.
print(results, digits=2)
## Inference for Bugs model at "C:/Projects/kaggle/sliced/methods/methods_bayes_software/bugs/alcModel.txt",
## Current: 1 chains, each with 1500 iterations (first 500 discarded)
## Cumulative: n.sims = 1000 iterations saved
## mean sd 2.5% 25% 50% 75% 97.5%
## b0 -1.51 0.07 -1.58 -1.55 -1.53 -1.49 -1.32
## b1 0.01 0.00 0.01 0.01 0.01 0.01 0.02
## b2[2] 1.33 0.07 1.13 1.30 1.34 1.38 1.42
## b2[3] 2.36 0.07 2.18 2.34 2.38 2.40 2.44
## b2[4] 3.08 0.07 2.89 3.06 3.10 3.12 3.15
## b2[5] 3.63 0.07 3.44 3.62 3.65 3.68 3.70
## b2[6] 4.00 0.07 3.82 3.99 4.03 4.05 4.07
## b2[7] 4.19 0.07 4.01 4.18 4.22 4.24 4.26
## b2[8] 4.27 0.07 4.08 4.25 4.29 4.31 4.34
## b2[9] 4.24 0.07 4.06 4.23 4.27 4.29 4.31
## b2[10] 4.09 0.07 3.89 4.07 4.11 4.13 4.16
## b2[11] 3.73 0.07 3.54 3.71 3.75 3.77 3.80
## b2[12] 3.36 0.07 3.17 3.34 3.38 3.40 3.44
## b2[13] 2.88 0.07 2.70 2.87 2.90 2.93 2.97
## b2[14] 2.23 0.07 2.05 2.20 2.25 2.28 2.34
## b3 0.78 0.01 0.77 0.77 0.78 0.78 0.79
However, I find it helpful to have the simulations in a data frame with one column for each parameter and one row for each iteration. My function, bugs_to_df(), extracts the simulations and adds columns denoting the chain number and iteration number. The code for this function is given in the appendix. I have placed bugs_to_df() in a package called MyPackage.
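The actual code for bugs_to_df() is in the appendix; purely as an illustration, a minimal single-chain version might look something like the sketch below, which assumes only the sims.matrix component seen in the str() output above. The hypothetical name bugs_to_df_sketch() is mine, not the package’s.

```r
library(tidyverse)

# minimal sketch of a bugs_to_df()-style converter for a single chain;
# the full function in the appendix also handles multiple chains
bugs_to_df_sketch <- function(results) {
  sims <- results$sims.matrix
  # turn names such as "b2[2]" into legal column names such as "b2_2"
  colnames(sims) <- gsub("\\[", "_", gsub("\\]", "", colnames(sims)))
  as_tibble(sims) %>%
    mutate(chain = factor(1), iter = row_number()) %>%
    relocate(chain, iter)
}
```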
library(MyPackage)
# --- results into a tibble ----------------------------------------
simDF <- bugs_to_df(results)
# --- show the tibble ----------------------------------------------
print(simDF)
## # A tibble: 1,000 x 18
## chain iter b0 b1 b2_2 b2_3 b2_4 b2_5 b2_6 b2_7 b2_8 b2_9
## <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 -1.55 0.0142 1.38 2.42 3.11 3.68 4.04 4.23 4.30 4.28
## 2 1 2 -1.42 0.0141 1.29 2.28 2.96 3.53 3.90 4.09 4.17 4.14
## 3 1 3 -1.56 0.0145 1.39 2.40 3.13 3.69 4.05 4.24 4.32 4.29
## 4 1 4 -1.56 0.0144 1.33 2.41 3.12 3.68 4.05 4.25 4.32 4.29
## 5 1 5 -1.51 0.0136 1.34 2.37 3.08 3.64 4.01 4.20 4.28 4.25
## 6 1 6 -1.56 0.0143 1.40 2.42 3.12 3.67 4.06 4.25 4.31 4.29
## 7 1 7 -1.56 0.0140 1.38 2.40 3.16 3.68 4.06 4.23 4.33 4.30
## 8 1 8 -1.41 0.0144 1.23 2.25 2.98 3.52 3.90 4.09 4.17 4.14
## 9 1 9 -1.54 0.0136 1.34 2.40 3.12 3.68 4.05 4.22 4.29 4.28
## 10 1 10 -1.57 0.0133 1.38 2.40 3.16 3.70 4.08 4.27 4.33 4.29
## # ... with 990 more rows, and 6 more variables: b2_10 <dbl>, b2_11 <dbl>,
## # b2_12 <dbl>, b2_13 <dbl>, b2_14 <dbl>, b3 <dbl>
Obviously this is just the beginning. Now I need to assess whether the chain has converged and if it has, I need to summarise the results, but that is for another time.
Multiple Chains
To run multiple chains in OpenBUGS, we just need to alter the n.chains argument in the bugs() call. However, this will run the chains one after the other; there is no built-in parallel processing.
I want to run the chains from different starting positions in order to see whether they converge to the same solution. One option is to set the initial values randomly. Here, all of the values are drawn from normal distributions with mean 0 and standard deviation 0.5. With random initial values, it is important to set a seed for reproducibility.
set.seed(8992)
bugsInits <- function() {
list( b0=rnorm(1, 0, sd=0.5),
b1=rnorm(1, 0, sd=0.5),
b2=c(NA, rnorm(13, 0, sd=0.5)),
b3=rnorm(1, 0, sd=0.5))
}
Alternatively, I could choose the initial values with a list of lists. Here are my initial values for 3 chains.
bugsInits <- list(
list( b0=-1, b1=-0.1, b2=c(NA, rep(0,13)), b3= 1),
list( b0= 0, b1= 0.0, b2=c(NA, rep(0,13)), b3= 0),
list( b0= 1, b1= 0.1, b2=c(NA, rep(0,13)), b3=-1)
)
It is trivial to change n.chains and re-run the bugs() function, so I’ll omit it to save space.
Parallel Processing
There are two levels of parallel processing. The simpler form just runs the chains on separate cores. The more complex form also parallelises the tasks required for the calculations within a single chain.
MultiBUGS
MultiBUGS is a project to implement this more complex form of parallel processing. MultiBUGS is still under development, but a beta version is available for download. It is accompanied by an R package, R2MultiBUGS, that mirrors R2OpenBUGS, so the structure of a MultiBUGS run in R uses almost identical code to the OpenBUGS example.
To install MultiBUGS, go to https://github.com/MultiBUGS/MultiBUGS#installation; to install R2MultiBUGS, go to https://github.com/MultiBUGS/R2MultiBUGS.
When I experimented with MultiBUGS, the program ran, but fell foul of my virus checker (McAfee on Windows). McAfee closes MultiBUGS and quarantines the executable. As yet, I have not found a way around this problem.
Multiple Cores
If you search the internet, you will find some old blog posts that show how to set up a multi-core BUGS analysis using the snow package. snow is rather dated and I will show a more modern method based on the furrr package, which produces reproducible runs.
I use Windows, so I have to run each chain in a separate R session. These sessions are completely independent of one another; for instance, each must have its own copy of the data. Such independence requires a separate working directory for each chain. R2OpenBUGS works by writing the data, initial values etc. to text files, and if the different processes were to use the same working directory, these files would overwrite one another.
I want to create 3 simultaneous OpenBUGS chains using 3 of my processor’s cores and so I start by creating three temporary working directories.
# --- Create 3 working directories ---------------------------------
for( i in 1:3 ) {
# --- address of the working directory ---------------------------
folder <- file.path( home, paste("temp/chain", i, sep=""))
# --- delete if it already exists -----------------------------
if( dir.exists(folder)) {
unlink(folder, recursive=TRUE)
}
# --- create the working directory -------------------------------
dir.create(folder)
}
Next I set up a tibble that describes the 3 runs that I want to complete.
# --- save the bugs data in an rds file --------------------------
saveRDS( bugsData, file.path(home, "bugs/alcData.rds"))
# --- create the analysis tibble ---------------------------------
set.seed(1782)
tibble(
model = rep( file.path(home, "bugs/alcModel.txt"), 3),
data = rep( file.path(home, "bugs/alcData.rds"), 3),
niter = rep(1500, 3),
nburnin = rep(500, 3),
thin = rep(1, 3),
seed = c(3, 4, 5),
inits = list(
list( b0=rnorm(1, 0, sd=0.5),
b1=rnorm(1, 0, sd=0.5),
b2=c(NA, rnorm(13, 0, sd=0.5)),
b3=rnorm(1, 0, sd=0.5)),
list( b0=rnorm(1, 0, sd=0.5),
b1=rnorm(1, 0, sd=0.5),
b2=c(NA, rnorm(13, 0, sd=0.5)),
b3=rnorm(1, 0, sd=0.5)),
list( b0=rnorm(1, 0, sd=0.5),
b1=rnorm(1, 0, sd=0.5),
b2=c(NA, rnorm(13, 0, sd=0.5)),
b3=rnorm(1, 0, sd=0.5)) ),
wDir = paste(
file.path(home, "temp/chain"), 1:3, sep=""),
sims = paste(
file.path(home, "data/dataStore/bugs_1_"), 1:3,".rds", sep="")
) %>%
print() -> bugsRunDF
## # A tibble: 3 x 9
## model data niter nburnin thin seed inits wDir sims
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <list> <chr> <chr>
## 1 C:/Project~ C:/Project~ 1500 500 1 3 <name~ C:/Projec~ C:/Projec~
## 2 C:/Project~ C:/Project~ 1500 500 1 4 <name~ C:/Projec~ C:/Projec~
## 3 C:/Project~ C:/Project~ 1500 500 1 5 <name~ C:/Projec~ C:/Projec~
# --- save as permanent record of the analysis ---------------------
saveRDS( bugsRunDF, file.path(home, "data/dataStore/bugs_df01.rds"))
Next, a function must be created that completes the whole analysis for a single chain. This function must be completely stand-alone; it cannot refer to any R objects in the calling environment unless they are passed as arguments. It can, however, read from and write to files.
# --- function to run a single chain ------------------------------
# arguments correspond to the columns of the analysis tibble
#
# returns
# nothing .. results written to rds files
#
run_bugs <- function(model, data, niter, nburnin, thin, seed,
inits, wDir, sims)
{
library(R2OpenBUGS)
# --- OpenBUGS executable
obExe <- "C:/Software/OpenBUGS/OpenBUGS323/OpenBUGS.exe"
# --- call to bugs() ---------------------------------------------
bugs( data = readRDS(data),
inits = list(inits),
parameters = c("b0", "b1", "b2", "b3"),
model.file = model,
n.chains = 1,
n.iter = niter,
n.burnin = nburnin,
n.thin = thin,
digits = 5,
codaPkg = FALSE,
bugs.seed = seed,
OpenBUGS = obExe,
working.directory = wDir
) %>%
# --- save the returned bugs object -----------------------------
saveRDS(sims)
}
Now I can use purrr or furrr to run these chains sequentially or in parallel. Naturally, I will use furrr. The function required is future_pwalk(), the equivalent of purrr’s pwalk(). The walk() family of functions is used when we want to iterate a function for its side effects, i.e. when the function does not return anything. pwalk() is the version that takes multiple arguments; here the arguments are all of the entries in a single row of the analysis tibble.
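As a toy illustration of pwalk()’s row-wise semantics (the tibble and its values are made up for the example):

```r
library(purrr)
library(tibble)

# each row of the tibble supplies one set of named arguments to the
# function; pwalk() is called purely for its side effect (printing here)
runs <- tibble(x = 1:3, y = c(10, 20, 30))
pwalk(runs, function(x, y) cat("x + y =", x + y, "\n"))
```

future_pwalk() behaves in exactly the same way, except that the rows are processed on separate workers.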
library(furrr)
# --- run three independent R sessions --------------
plan(multisession, workers=3)
# --- run the chains in parallel --------------------
bugsRunDF %>%
future_pwalk(run_bugs) %>%
system.time()
# --- switch back to sequential processing ----------
plan(sequential)
Three chains took 7.75 seconds, more or less the same time as a single chain.
Now I can read the results and bind them into a single data frame. Since running multiple chains is the norm, I have written a function, bayes_to_df(), that takes a vector of rds file names and returns a data frame of the simulations that they contain. The code for the function is in the appendix.
# --- read the simulations --------------------------------
bayes_to_df(bugsRunDF$sims) %>%
print() -> simDF
## # A tibble: 3,000 x 19
## chain iter b0 b1 b2_2 b2_3 b2_4 b2_5 b2_6 b2_7 b2_8 b2_9
## <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 -1.55 0.0148 1.41 2.39 3.10 3.67 4.05 4.23 4.30 4.29
## 2 1 2 -1.54 0.0144 1.35 2.38 3.11 3.68 4.03 4.21 4.28 4.27
## 3 1 3 -1.31 0.0142 1.12 2.15 2.86 3.44 3.80 3.99 4.08 4.06
## 4 1 4 -1.32 0.0142 1.11 2.18 2.88 3.45 3.81 3.99 4.07 4.05
## 5 1 5 -1.52 0.0146 1.36 2.38 3.10 3.66 4.01 4.20 4.29 4.25
## 6 1 6 -1.35 0.0136 1.15 2.18 2.93 3.47 3.84 4.04 4.12 4.09
## 7 1 7 -1.28 0.0134 1.08 2.13 2.86 3.42 3.79 3.97 4.06 4.04
## 8 1 8 -1.31 0.0144 1.10 2.16 2.88 3.42 3.80 4.00 4.07 4.04
## 9 1 9 -1.40 0.0149 1.22 2.24 2.97 3.51 3.88 4.07 4.13 4.12
## 10 1 10 -1.40 0.0144 1.22 2.26 2.95 3.51 3.88 4.08 4.15 4.14
## # ... with 2,990 more rows, and 7 more variables: b2_10 <dbl>, b2_11 <dbl>,
## # b2_12 <dbl>, b2_13 <dbl>, b2_14 <dbl>, b3 <dbl>, deviance <dbl>
nimble
The R package nimble uses a language very similar to BUGS, but instead of passing the problem to an external program, the model code and chosen samplers are translated into C++, compiled, linked and run. For a comprehensive description of the package, see https://r-nimble.org/.
nimble does extend the BUGS language in a few ways, such as allowing different parameterisations of its distributions, but these differences are minor, so for my simple example I can use the BUGS model code as is.
The function nimbleCode() is the package’s way of capturing the model code; this time the code is not written to a text file, but saved as an R object.
library(nimble)
nimbleCode( {
for( i in 1:560 ) {
log(mu[i]) <- b0 + b1*year[i] + b2[age[i]] + b3*gender[i] + offset[i]
deaths[i] ~ dpois(mu[i])
}
b0 ~ dnorm(0, 0.0001)
b1 ~ dnorm(0, 0.0001)
b2[1] <- 0.0
for(j in 2:14) {
b2[j] ~ dnorm(0, 0.0001)
}
b3 ~ dnorm(0, 0.0001)
} ) -> modelCode
The data are placed in a list identical to that used by OpenBUGS.
nimbleData <- list( deaths = alcDF$deaths,
offset = log(alcDF$pop),
year = alcDF$year - 2001,
gender = as.numeric( alcDF$gender == "male"),
age = as.numeric( alcDF$age))
The initial values go into a list, rather than a list of lists.
nimbleInits <-
list( b0=0, b1=0, b2=c(NA, rep(0,13)), b3=0)
Next, the code, data and initial values are combined and compiled.
# --- create the model ---------------------------------
nimbleModel(
code = modelCode,
data = nimbleData,
inits = nimbleInits
) -> model
# --- Compile the model ---------------------------------
modelCompiled <- compileNimble(model)
Now, the samplers are compiled/linked with the model. I use the default samplers chosen by nimble, but I could have made my own choices.
# --- build the sampler ---------------------------------
buildMCMC(
conf = modelCompiled,
monitors = c("b0", "b1", "b2", "b3")
) -> nimbleMCMC
# --- final compilation ---------------------------------
mcmcCompiled <- compileNimble(nimbleMCMC)
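As noted above, I could have made my own sampler choices rather than accepting the defaults. A sketch of how that might be done with nimble’s configureMCMC(), applied to the uncompiled model object created earlier (the choice of a slice sampler for b0 is purely illustrative):

```r
library(nimble)

# --- inspect and customise the samplers before building the MCMC ----
# an alternative to passing the model straight to buildMCMC()
conf <- configureMCMC(model, monitors = c("b0", "b1", "b2", "b3"))
conf$printSamplers()                             # show the default choices
conf$removeSamplers("b0")                        # drop the default sampler for b0
conf$addSampler(target = "b0", type = "slice")   # use a slice sampler instead
nimbleMCMC <- buildMCMC(conf)
# compilation and runMCMC() then proceed exactly as before
```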
Finally, the compiled MCMC code is run.
# --- run the compiled nimble code ----------------------
runMCMC(
mcmc = mcmcCompiled,
niter = 1500,
nburnin = 500,
setSeed = 1832
) %>%
saveRDS( file.path( home, "data/dataStore/alcNimble01.rds")) %>%
system.time()
nimble takes less than half the time required by OpenBUGS. However, this timing does not include compilation.
The structure returned by nimble is much simpler than that of R2OpenBUGS.
results <- readRDS(file.path( home, "data/dataStore/alcNimble01.rds"))
str(results)
## num [1:1000, 1:17] 0.422 0.422 0.422 0.422 0.422 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr [1:17] "b0" "b1" "b2[1]" "b2[2]" ...
My function for placing these results into a tibble is called nimble_to_df() and is given in the appendix. Alternatively, the function bayes_to_df() will return a data frame from the names of one or more rds files.
# --- read the results for the example ------------------------
simDF <- nimble_to_df(results)
# --- show the results ----------------------------------------
print(simDF)
## # A tibble: 1,000 x 19
## chain iter b0 b1 b2_1 b2_2 b2_3 b2_4 b2_5 b2_6 b2_7 b2_8
## <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 0.422 0.00141 0 -0.424 0.539 1.30 1.86 2.21 2.40 2.47
## 2 1 2 0.422 0.00141 0 -0.424 0.606 1.30 1.86 2.21 2.40 2.47
## 3 1 3 0.422 0.00141 0 -0.424 0.606 1.30 1.86 2.23 2.40 2.47
## 4 1 4 0.422 0.00141 0 -0.464 0.606 1.30 1.86 2.23 2.40 2.47
## 5 1 5 0.422 0.00141 0 -0.464 0.606 1.30 1.86 2.23 2.40 2.46
## 6 1 6 0.422 0.00141 0 -0.439 0.606 1.30 1.86 2.23 2.40 2.46
## 7 1 7 0.422 0.00141 0 -0.439 0.606 1.30 1.86 2.23 2.40 2.46
## 8 1 8 0.422 0.00141 0 -0.508 0.608 1.30 1.86 2.23 2.42 2.46
## 9 1 9 0.422 0.00141 0 -0.465 0.608 1.30 1.86 2.23 2.41 2.46
## 10 1 10 0.422 0.00141 0 -0.478 0.608 1.30 1.86 2.23 2.41 2.46
## # ... with 990 more rows, and 7 more variables: b2_9 <dbl>, b2_10 <dbl>,
## # b2_11 <dbl>, b2_12 <dbl>, b2_13 <dbl>, b2_14 <dbl>, b3 <dbl>
Notice that b0 does not move over the first 10 iterations; this is a consequence of nimble’s choice of samplers. I could do better, but that is not the point of this post.
Parallel Nimble
nimble is made parallel in much the same way as OpenBUGS. The entire nimble analysis, including the compilations, is placed in a function that is called by future_pwalk(). I believe that the authors of nimble have plans to improve the parallel processing, so it might be worth checking their website.
First, I write the code and data to rds files and then create an analysis data frame. There is no need for separate working directories, because model components are not written to text files as they are in OpenBUGS.
saveRDS( modelCode, file.path(home, "nimble/alcModel.rds"))
# --- save the nimble data in an rds file --------------------------
saveRDS( nimbleData, file.path(home, "nimble/alcData.rds"))
# --- create the analysis tibble ---------------------------------
set.seed(1782)
tibble(
model = rep( file.path(home, "nimble/alcModel.rds"), 3),
data = rep( file.path(home, "nimble/alcData.rds"), 3),
niter = rep(1500, 3),
nburnin = rep(500, 3),
thin = rep(1, 3),
seed = c(3328, 4309, 5881),
inits = list(
list( b0=rnorm(1, 0, sd=0.5),
b1=rnorm(1, 0, sd=0.5),
b2=c(NA, rnorm(13, 0, sd=0.5)),
b3=rnorm(1, 0, sd=0.5)),
list( b0=rnorm(1, 0, sd=0.5),
b1=rnorm(1, 0, sd=0.5),
b2=c(NA, rnorm(13, 0, sd=0.5)),
b3=rnorm(1, 0, sd=0.5)),
list( b0=rnorm(1, 0, sd=0.5),
b1=rnorm(1, 0, sd=0.5),
b2=c(NA, rnorm(13, 0, sd=0.5)),
b3=rnorm(1, 0, sd=0.5)) ),
sims = paste(
file.path(home, "data/dataStore/nimble_1_"), 1:3,".rds", sep="")
) %>%
print() -> nimbleRunDF
# --- save as permanent record of the analysis ---------------------
saveRDS( nimbleRunDF, file.path(home, "data/dataStore/nimble_df01.rds"))
# --- a function for all stages of the nimble analysis ---------------
run_nimble <- function(model, data, niter, nburnin, thin, seed,
inits, sims)
{
library(nimble)
# --- create the model and compile ------------------------
nimbleModel(code = readRDS(model),
data = readRDS(data),
inits = inits) %>%
compileNimble() %>%
# --- choose samplers and link ----------------------------
buildMCMC() %>%
compileNimble() %>%
# --- run the chain ---------------------------------------
runMCMC(niter = niter, nburnin = nburnin,
thin = thin, setSeed = seed) %>%
# --- Save the results ------------------------------------
saveRDS(sims)
}
Next, I create a cluster of 3 cores and run nimble independently on each of the cores. The method is the same as that used with OpenBUGS.
library(furrr)
# --- run three independent R sessions --------------
plan(multisession, workers=3)
# --- run the chains in parallel --------------------
nimbleRunDF %>%
future_pwalk(run_nimble) %>%
system.time()
# --- switch back to sequential processing ----------
plan(sequential)
bayes_to_df(nimbleRunDF$sims) %>%
print() -> simDF
The program appears to be slower than OpenBUGS, but the time includes compilation. For longer, more realistic chains, compilation is a smaller fraction of the total time and nimble's speed is very competitive.
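With three independent chains in hand, it is worth checking that they agree before trusting the results. Here is a sketch of a Gelman-Rubin check using the coda package; it assumes simDF is the tibble returned by my bayes_to_df() helper, with chain and iter columns alongside the parameters.

```r
library(coda)
library(dplyr)

# --- split the pooled tibble by chain and hand it to coda ---------
simDF %>%
  split(simDF$chain) %>%
  lapply(function(d) mcmc(select(d, -chain, -iter))) %>%
  mcmc.list() %>%
  gelman.diag(autoburnin = FALSE)
```

Potential scale reduction factors close to 1 would suggest the chains are sampling from the same distribution; the sticky b0 seen above would likely show up here as a large value.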
Stan
Stan uses the HMC algorithm, which for some problems performs better than the Gibbs sampling of BUGS and its derivatives. Much like nimble, Stan turns the model into C++ code and compiles it, but unlike nimble, it modifies the BUGS language to make it easier to optimise the performance of its compiled code. The chief difference lies in the requirement to define the size, type and range of the data, parameters and intermediate variables. That said, the code is noticeably BUGS-like.
A Stan model is created in sections, though not all of them are needed for every problem. The sections are,
- data: a description of the size, type and range of the data
- transformed data: variables derived by transforming the data
- parameters: a description of the size, type and range of the parameters
- transformed parameters: variables derived by transforming the parameters
- model: the code for the likelihood and prior (much like BUGS)
- generated quantities: values derived from the simulations, including predictions
Stan has more functions than BUGS and a greater variety of parameterisations for its distributions. It also adopts simpler distribution names, such as normal() instead of dnorm() for the normal distribution, though beware that some of its default parameterisations differ from those of BUGS.
In Stan, lines end with a semi-colon and = replaces <-.
The Poisson regression problem might be coded as,
data {
int N;
int deaths[N];
vector[N] year;
int<lower=1, upper=14> age[N];
vector[N] gender;
vector[N] offset;
}
parameters {
real b0;
real b1;
vector[13] b2;
real b3;
}
transformed parameters {
real mu[N];
vector[14] b4;
b4[1] = 0.0;
for( i in 2:14 ) {
b4[i] = b2[i-1];
}
for( i in 1:N) {
mu[i] = exp(b0 + b1*year[i] + b4[age[i]] + b3*gender[i] + offset[i]);
}
}
model {
deaths ~ poisson(mu);
b0 ~ normal(0, 10);
b1 ~ normal(0, 10);
for(j in 1:13) {
b2[j] ~ normal(0, 10);
}
b3 ~ normal(0, 10);
}
Since I tell Stan that age is an integer between 1 and 14, it will check the data when they are read. Giving ranges for the parameters helps Stan to create more efficient code. The types are required, but the ranges can usually be omitted and the program will still work, just not as quickly.
Notice that Stan is vectorised, so it is possible to write deaths ~ poisson(mu) and, because deaths has been defined as a vector, Stan will expand the calculation without us needing to give an explicit loop. Had I made the loop explicit, it would still have worked.
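As a sketch, the explicit-loop equivalent of that one vectorised statement would be,

```stan
model {
  // equivalent, but explicit, form of the likelihood
  for (i in 1:N) {
    deaths[i] ~ poisson(mu[i]);
  }
}
```

The two forms give the same posterior; the vectorised version is usually a little faster because Stan can share work across the elements.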
By default, Stan parameterises the normal distribution in terms of its mean and standard deviation. See the Stan functions guide for more details, https://mc-stan.org/docs/2_20/functions-reference/normal-distribution.html.
The Stan documentation has a page on transitioning from BUGS that might be helpful if you start with some familiarity with BUGS. See https://mc-stan.org/docs/2_28/stan-users-guide/stan-for-bugs.html.
Running RStan
Saving the model code
The model code should be written to a text file, just as for R2OpenBUGS. Conventionally, these files are given a .stan extension. My file is called alcModel.stan and is saved in a folder called stan.
Preparing the data
Data are placed in a list, just as with BUGS.
stanData <- list( N = 560,
deaths = as.integer(alcDF$deaths),
offset = log(alcDF$pop),
year = as.numeric(alcDF$year - 2001),
gender = as.numeric( alcDF$gender == "male"),
age = as.integer( alcDF$age))
Preparing the initial values
No need. Stan has an initial warm-up phase during which it tunes the algorithm; at the same time, it finds good starting values for you.
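Should you want to supply your own starting values anyway, stan() accepts an init argument: a list containing one named list per chain. A sketch, using starting values similar to those I gave nimble:

```r
# --- optional: explicit initial values, one named list per chain ---
set.seed(1782)
myInits <- lapply(1:3, function(i) {
  list(b0 = rnorm(1, 0, 0.5),
       b1 = rnorm(1, 0, 0.5),
       b2 = rnorm(13, 0, 0.5),
       b3 = rnorm(1, 0, 0.5))
})
# then add init = myInits to the call to stan()
```

For most problems the default random initialisation plus warm-up works well and this is unnecessary.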
Running the sampler
There is a stan() function in rstan that is directly analogous to the bugs() function of R2OpenBUGS.
Compilation etc. is all handled for you and parallel processing is built-in.
library(rstan)
stan(
file = file.path(home, "stan/alcModel.stan"), # Stan program
data = stanData, # named list of data
pars = c("b0", "b1", "b2", "b3"), # parameters to monitor
chains = 3, # number of Markov chains
warmup = 500, # number of warmup iterations per chain
iter = 1000, # total number of iterations per chain
cores = 3 # number of cores
) %>%
saveRDS( file.path( home, "data/dataStore/alcStanP01.rds")) %>%
system.time()
Like nimble, the time taken by Stan is in large part a result of the compilation, and Stan becomes more competitive as the computation grows longer.
The object returned by rstan is quite complex, but my function stan_to_df(), which is given in the appendix, will extract the results into a tibble.
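rstan also has built-in extractors, if you prefer not to use my helper. A sketch of the main options:

```r
library(rstan)

# extract() returns a named list of arrays, with the chains pooled
draws <- extract(results, pars = c("b0", "b1", "b2", "b3"))

# as.data.frame() flattens the monitored parameters into columns
head(as.data.frame(results, pars = c("b0", "b3")))

# summary() gives means, quantiles, n_eff and Rhat for each parameter
summary(results, pars = c("b0", "b3"))$summary
```

The drawback of extract() is that, by default, it permutes the draws and loses the chain structure, which is why I prefer a tibble with explicit chain and iteration columns.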
# --- read the stan results --------------------------------
results <- readRDS(file.path( home, "data/dataStore/alcStanP01.rds"))
# --- look at the structure --------------------------------
str(results)
## Formal class 'stanfit' [package "rstan"] with 10 slots
## ..@ model_name: chr "alcModel"
## ..@ model_pars: chr [1:7] "b0" "b1" "b2" "b3" ...
## ..@ par_dims :List of 7
## .. ..$ b0 : num(0)
## .. ..$ b1 : num(0)
## .. ..$ b2 : num 13
## .. ..$ b3 : num(0)
## .. ..$ mu : num 560
## .. ..$ b4 : num 14
## .. ..$ lp__: num(0)
## ..@ mode : int 0
## ..@ sim :List of 12
## .. ..$ samples :List of 3
## .. .. ..$ :List of 17
## .. .. .. ..$ b0 : num [1:1000] -0.872 -0.872 -0.872 -0.872 -0.872 ...
## .. .. .. ..$ b1 : num [1:1000] -0.123 -0.123 -0.123 -0.123 -0.123 ...
## .. .. .. ..$ b2[1] : num [1:1000] 1.02 1.02 1.02 1.02 1.02 ...
## .. .. .. ..$ b2[2] : num [1:1000] 1.09 1.09 1.09 1.09 1.09 ...
## .. .. .. ..$ b2[3] : num [1:1000] -1.43 -1.43 -1.43 -1.43 -1.43 ...
## .. .. .. ..$ b2[4] : num [1:1000] -0.0502 -0.0502 -0.0502 -0.0502 -0.0502 ...
## .. .. .. ..$ b2[5] : num [1:1000] -1.79 -1.79 -1.79 -1.79 -1.79 ...
## .. .. .. ..$ b2[6] : num [1:1000] -0.862 -0.862 -0.862 -0.862 -0.862 ...
## .. .. .. ..$ b2[7] : num [1:1000] 0.711 0.711 0.711 0.711 0.711 ...
## .. .. .. ..$ b2[8] : num [1:1000] -0.634 -0.634 -0.634 -0.634 -0.634 ...
## .. .. .. ..$ b2[9] : num [1:1000] -1.55 -1.55 -1.55 -1.55 -1.55 ...
## .. .. .. ..$ b2[10]: num [1:1000] 1.83 1.83 1.83 1.83 1.83 ...
## .. .. .. ..$ b2[11]: num [1:1000] -0.427 -0.427 -0.427 -0.427 -0.427 ...
## .. .. .. ..$ b2[12]: num [1:1000] 0.376 0.376 0.376 0.376 0.376 ...
## .. .. .. ..$ b2[13]: num [1:1000] -0.533 -0.533 -0.533 -0.533 -0.533 ...
## .. .. .. ..$ b3 : num [1:1000] -1.57 -1.57 -1.57 -1.57 -1.57 ...
## .. .. .. ..$ lp__ : num [1:1000] -101805 -101805 -101805 -101805 -101805 ...
## .. .. .. ..- attr(*, "test_grad")= logi FALSE
## .. .. .. ..- attr(*, "args")=List of 16
## .. .. .. .. ..$ append_samples : logi FALSE
## .. .. .. .. ..$ chain_id : num 1
## .. .. .. .. ..$ control :List of 12
## .. .. .. .. .. ..$ adapt_delta : num 0.8
## .. .. .. .. .. ..$ adapt_engaged : logi TRUE
## .. .. .. .. .. ..$ adapt_gamma : num 0.05
## .. .. .. .. .. ..$ adapt_init_buffer: num 75
## .. .. .. .. .. ..$ adapt_kappa : num 0.75
## .. .. .. .. .. ..$ adapt_t0 : num 10
## .. .. .. .. .. ..$ adapt_term_buffer: num 50
## .. .. .. .. .. ..$ adapt_window : num 25
## .. .. .. .. .. ..$ max_treedepth : int 10
## .. .. .. .. .. ..$ metric : chr "diag_e"
## .. .. .. .. .. ..$ stepsize : num 1
## .. .. .. .. .. ..$ stepsize_jitter : num 0
## .. .. .. .. ..$ enable_random_init: logi TRUE
## .. .. .. .. ..$ init : chr "random"
## .. .. .. .. ..$ init_list : NULL
## .. .. .. .. ..$ init_radius : num 2
## .. .. .. .. ..$ iter : int 1000
## .. .. .. .. ..$ method : chr "sampling"
## .. .. .. .. ..$ random_seed : chr "827152961"
## .. .. .. .. ..$ refresh : int 100
## .. .. .. .. ..$ sampler_t : chr "NUTS(diag_e)"
## .. .. .. .. ..$ save_warmup : logi TRUE
## .. .. .. .. ..$ test_grad : logi FALSE
## .. .. .. .. ..$ thin : int 1
## .. .. .. .. ..$ warmup : int 500
## .. .. .. ..- attr(*, "inits")= num [1:590] -0.888 -0.289 1.017 1.091 -1.428 ...
## .. .. .. ..- attr(*, "mean_pars")= num [1:590] -1.6849 0.0142 1.5011 2.5297 3.2542 ...
## .. .. .. ..- attr(*, "mean_lp__")= num 691125
## .. .. .. ..- attr(*, "adaptation_info")= chr "# Adaptation terminated\n# Step size = 0.00849926\n# Diagonal elements of inverse mass matrix:\n# 0.0902026, 2."| __truncated__
## .. .. .. ..- attr(*, "elapsed_time")= Named num [1:2] 4.81 5.48
## .. .. .. .. ..- attr(*, "names")= chr [1:2] "warmup" "sample"
## .. .. .. ..- attr(*, "sampler_params")=List of 6
## .. .. .. .. ..$ accept_stat__: num [1:1000] 1 0 0 0 0 0 1 1 1 0.5 ...
## .. .. .. .. ..$ stepsize__ : num [1:1000] 4.88e-04 1.44e+01 2.43 2.40e-01 1.86e-02 ...
## .. .. .. .. ..$ treedepth__ : num [1:1000] 1 0 0 0 0 0 2 1 1 1 ...
## .. .. .. .. ..$ n_leapfrog__ : num [1:1000] 1 1 1 1 1 1 3 1 1 2 ...
## .. .. .. .. ..$ divergent__ : num [1:1000] 0 1 1 1 1 1 0 0 0 1 ...
## .. .. .. .. ..$ energy__ : num [1:1000] 335457 101811 101811 101813 101813 ...
## .. .. .. ..- attr(*, "return_code")= int 0
## .. .. ..$ :List of 17
## .. .. .. ..$ b0 : num [1:1000] 1.71 1.71 1.71 1.71 1.71 ...
## .. .. .. ..$ b1 : num [1:1000] 0.298 0.298 0.298 0.298 0.298 ...
## .. .. .. ..$ b2[1] : num [1:1000] -1.9 -1.9 -1.9 -1.9 -1.9 ...
## .. .. .. ..$ b2[2] : num [1:1000] 0.747 0.747 0.747 0.747 0.747 ...
## .. .. .. ..$ b2[3] : num [1:1000] 1.63 1.63 1.63 1.63 1.63 ...
## .. .. .. ..$ b2[4] : num [1:1000] -1.19 -1.19 -1.19 -1.19 -1.19 ...
## .. .. .. ..$ b2[5] : num [1:1000] -1.71 -1.71 -1.71 -1.71 -1.71 ...
## .. .. .. ..$ b2[6] : num [1:1000] -1.19 -1.19 -1.19 -1.19 -1.19 ...
## .. .. .. ..$ b2[7] : num [1:1000] 1.23 1.23 1.23 1.23 1.23 ...
## .. .. .. ..$ b2[8] : num [1:1000] -0.255 -0.255 -0.255 -0.255 -0.255 ...
## .. .. .. ..$ b2[9] : num [1:1000] 0.0068 0.0068 0.0068 0.0068 0.0068 ...
## .. .. .. ..$ b2[10]: num [1:1000] 0.44 0.44 0.44 0.44 0.44 ...
## .. .. .. ..$ b2[11]: num [1:1000] 0.0699 0.0699 0.0699 0.0699 0.0699 ...
## .. .. .. ..$ b2[12]: num [1:1000] 1.39 1.39 1.39 1.39 1.39 ...
## .. .. .. ..$ b2[13]: num [1:1000] -1.27 -1.27 -1.27 -1.27 -1.27 ...
## .. .. .. ..$ b3 : num [1:1000] -1.45 -1.45 -1.45 -1.45 -1.45 ...
## .. .. .. ..$ lp__ : num [1:1000] -1905693 -1905693 -1905693 -1905693 -1905693 ...
## .. .. .. ..- attr(*, "test_grad")= logi FALSE
## .. .. .. ..- attr(*, "args")=List of 16
## .. .. .. .. ..$ append_samples : logi FALSE
## .. .. .. .. ..$ chain_id : num 2
## .. .. .. .. ..$ control :List of 12
## .. .. .. .. .. ..$ adapt_delta : num 0.8
## .. .. .. .. .. ..$ adapt_engaged : logi TRUE
## .. .. .. .. .. ..$ adapt_gamma : num 0.05
## .. .. .. .. .. ..$ adapt_init_buffer: num 75
## .. .. .. .. .. ..$ adapt_kappa : num 0.75
## .. .. .. .. .. ..$ adapt_t0 : num 10
## .. .. .. .. .. ..$ adapt_term_buffer: num 50
## .. .. .. .. .. ..$ adapt_window : num 25
## .. .. .. .. .. ..$ max_treedepth : int 10
## .. .. .. .. .. ..$ metric : chr "diag_e"
## .. .. .. .. .. ..$ stepsize : num 1
## .. .. .. .. .. ..$ stepsize_jitter : num 0
## .. .. .. .. ..$ enable_random_init: logi TRUE
## .. .. .. .. ..$ init : chr "random"
## .. .. .. .. ..$ init_list : NULL
## .. .. .. .. ..$ init_radius : num 2
## .. .. .. .. ..$ iter : int 1000
## .. .. .. .. ..$ method : chr "sampling"
## .. .. .. .. ..$ random_seed : chr "827152961"
## .. .. .. .. ..$ refresh : int 100
## .. .. .. .. ..$ sampler_t : chr "NUTS(diag_e)"
## .. .. .. .. ..$ save_warmup : logi TRUE
## .. .. .. .. ..$ test_grad : logi FALSE
## .. .. .. .. ..$ thin : int 1
## .. .. .. .. ..$ warmup : int 500
## .. .. .. ..- attr(*, "inits")= num [1:590] 1.717 0.373 -1.9 0.747 1.635 ...
## .. .. .. ..- attr(*, "mean_pars")= num [1:590] -1.68 0.0143 1.4944 2.5243 3.2479 ...
## .. .. .. ..- attr(*, "mean_lp__")= num 691125
## .. .. .. ..- attr(*, "adaptation_info")= chr "# Adaptation terminated\n# Step size = 0.0147061\n# Diagonal elements of inverse mass matrix:\n# 0.0371848, 2.4"| __truncated__
## .. .. .. ..- attr(*, "elapsed_time")= Named num [1:2] 4.58 4.99
## .. .. .. .. ..- attr(*, "names")= chr [1:2] "warmup" "sample"
## .. .. .. ..- attr(*, "sampler_params")=List of 6
## .. .. .. .. ..$ accept_stat__: num [1:1000] 1 0 0 0 0 0 1 1 1 1 ...
## .. .. .. .. ..$ stepsize__ : num [1:1000] 3.05e-05 1.44e+01 2.43 2.40e-01 1.86e-02 ...
## .. .. .. .. ..$ treedepth__ : num [1:1000] 3 0 0 0 0 0 3 2 2 2 ...
## .. .. .. .. ..$ n_leapfrog__ : num [1:1000] 7 1 1 1 1 1 7 3 3 3 ...
## .. .. .. .. ..$ divergent__ : num [1:1000] 0 1 1 1 1 1 0 0 0 0 ...
## .. .. .. .. ..$ energy__ : num [1:1000] 6771707 1905699 1905698 1905703 1905702 ...
## .. .. .. ..- attr(*, "return_code")= int 0
## .. .. ..$ :List of 17
## .. .. .. ..$ b0 : num [1:1000] -0.266 -0.266 -0.266 -0.266 -0.266 ...
## .. .. .. ..$ b1 : num [1:1000] 1.8 1.8 1.8 1.8 1.8 ...
## .. .. .. ..$ b2[1] : num [1:1000] -0.319 -0.319 -0.319 -0.319 -0.319 ...
## .. .. .. ..$ b2[2] : num [1:1000] 1.63 1.63 1.63 1.63 1.63 ...
## .. .. .. ..$ b2[3] : num [1:1000] 1.61 1.61 1.61 1.61 1.61 ...
## .. .. .. ..$ b2[4] : num [1:1000] 1.29 1.29 1.29 1.29 1.29 ...
## .. .. .. ..$ b2[5] : num [1:1000] 1.94 1.94 1.94 1.94 1.94 ...
## .. .. .. ..$ b2[6] : num [1:1000] -0.188 -0.188 -0.188 -0.188 -0.188 ...
## .. .. .. ..$ b2[7] : num [1:1000] 0.648 0.648 0.648 0.648 0.648 ...
## .. .. .. ..$ b2[8] : num [1:1000] -0.749 -0.749 -0.749 -0.749 -0.749 ...
## .. .. .. ..$ b2[9] : num [1:1000] 1.26 1.26 1.26 1.26 1.26 ...
## .. .. .. ..$ b2[10]: num [1:1000] -0.145 -0.145 -0.145 -0.145 -0.145 ...
## .. .. .. ..$ b2[11]: num [1:1000] 0.304 0.304 0.304 0.304 0.304 ...
## .. .. .. ..$ b2[12]: num [1:1000] 0.805 0.805 0.805 0.805 0.805 ...
## .. .. .. ..$ b2[13]: num [1:1000] -0.8 -0.8 -0.8 -0.8 -0.8 ...
## .. .. .. ..$ b3 : num [1:1000] 1.67 1.67 1.67 1.67 1.67 ...
## .. .. .. ..$ lp__ : num [1:1000] -2.43e+18 -2.43e+18 -2.43e+18 -2.43e+18 -2.43e+18 ...
## .. .. .. ..- attr(*, "test_grad")= logi FALSE
## .. .. .. ..- attr(*, "args")=List of 16
## .. .. .. .. ..$ append_samples : logi FALSE
## .. .. .. .. ..$ chain_id : num 3
## .. .. .. .. ..$ control :List of 12
## .. .. .. .. .. ..$ adapt_delta : num 0.8
## .. .. .. .. .. ..$ adapt_engaged : logi TRUE
## .. .. .. .. .. ..$ adapt_gamma : num 0.05
## .. .. .. .. .. ..$ adapt_init_buffer: num 75
## .. .. .. .. .. ..$ adapt_kappa : num 0.75
## .. .. .. .. .. ..$ adapt_t0 : num 10
## .. .. .. .. .. ..$ adapt_term_buffer: num 50
## .. .. .. .. .. ..$ adapt_window : num 25
## .. .. .. .. .. ..$ max_treedepth : int 10
## .. .. .. .. .. ..$ metric : chr "diag_e"
## .. .. .. .. .. ..$ stepsize : num 1
## .. .. .. .. .. ..$ stepsize_jitter : num 0
## .. .. .. .. ..$ enable_random_init: logi TRUE
## .. .. .. .. ..$ init : chr "random"
## .. .. .. .. ..$ init_list : NULL
## .. .. .. .. ..$ init_radius : num 2
## .. .. .. .. ..$ iter : int 1000
## .. .. .. .. ..$ method : chr "sampling"
## .. .. .. .. ..$ random_seed : chr "827152961"
## .. .. .. .. ..$ refresh : int 100
## .. .. .. .. ..$ sampler_t : chr "NUTS(diag_e)"
## .. .. .. .. ..$ save_warmup : logi TRUE
## .. .. .. .. ..$ test_grad : logi FALSE
## .. .. .. .. ..$ thin : int 1
## .. .. .. .. ..$ warmup : int 500
## .. .. .. ..- attr(*, "inits")= num [1:590] -0.263 1.854 -0.319 1.633 1.615 ...
## .. .. .. ..- attr(*, "mean_pars")= num [1:590] -1.6771 0.0143 1.4971 2.5227 3.2461 ...
## .. .. .. ..- attr(*, "mean_lp__")= num 691125
## .. .. .. ..- attr(*, "adaptation_info")= chr "# Adaptation terminated\n# Step size = 0.0243091\n# Diagonal elements of inverse mass matrix:\n# 0.00675896, 2."| __truncated__
## .. .. .. ..- attr(*, "elapsed_time")= Named num [1:2] 4.73 5.04
## .. .. .. .. ..- attr(*, "names")= chr [1:2] "warmup" "sample"
## .. .. .. ..- attr(*, "sampler_params")=List of 6
## .. .. .. .. ..$ accept_stat__: num [1:1000] 1 0 0 0 0 0 0 0 0 0 ...
## .. .. .. .. ..$ stepsize__ : num [1:1000] 2.91e-11 1.44e+01 2.43 2.40e-01 1.86e-02 ...
## .. .. .. .. ..$ treedepth__ : num [1:1000] 2 0 0 0 0 0 0 0 0 0 ...
## .. .. .. .. ..$ n_leapfrog__ : num [1:1000] 3 1 1 1 1 1 1 1 1 1 ...
## .. .. .. .. ..$ divergent__ : num [1:1000] 0 1 1 1 1 1 1 1 1 1 ...
## .. .. .. .. ..$ energy__ : num [1:1000] 6.05e+18 2.43e+18 2.43e+18 2.43e+18 2.43e+18 ...
## .. .. .. ..- attr(*, "return_code")= int 0
## .. ..$ chains : int 3
## .. ..$ iter : num 1000
## .. ..$ thin : num 1
## .. ..$ warmup : num 500
## .. ..$ n_save : num [1:3] 1000 1000 1000
## .. ..$ warmup2 : num [1:3] 500 500 500
## .. ..$ permutation:List of 3
## .. .. ..$ : int [1:500] 296 396 423 462 419 6 430 494 234 221 ...
## .. .. ..$ : int [1:500] 343 34 337 338 486 363 135 400 433 249 ...
## .. .. ..$ : int [1:500] 329 223 414 453 344 368 250 16 101 325 ...
## .. ..$ pars_oi : chr [1:5] "b0" "b1" "b2" "b3" ...
## .. ..$ dims_oi :List of 5
## .. .. ..$ b0 : num(0)
## .. .. ..$ b1 : num(0)
## .. .. ..$ b2 : num 13
## .. .. ..$ b3 : num(0)
## .. .. ..$ lp__: num(0)
## .. ..$ fnames_oi : chr [1:17] "b0" "b1" "b2[1]" "b2[2]" ...
## .. ..$ n_flatnames: int 17
## ..@ inits :List of 3
## .. ..$ :List of 6
## .. .. ..$ b0: num -0.888
## .. .. ..$ b1: num -0.289
## .. .. ..$ b2: num [1:13(1d)] 1.017 1.091 -1.428 -0.052 -1.793 ...
## .. .. ..$ b3: num -1.58
## .. .. ..$ mu: num [1:560(1d)] 0.0073 0.04 0.0205 0.1002 0.0222 ...
## .. .. ..$ b4: num [1:14(1d)] 0 1.017 1.091 -1.428 -0.052 ...
## .. ..$ :List of 6
## .. .. ..$ b0: num 1.72
## .. .. ..$ b1: num 0.373
## .. .. ..$ b2: num [1:13(1d)] -1.9 0.747 1.635 -1.186 -1.706 ...
## .. .. ..$ b3: num -1.45
## .. .. ..$ mu: num [1:560(1d)] 32125 155552 4883 21061 69420 ...
## .. .. ..$ b4: num [1:14(1d)] 0 -1.9 0.747 1.635 -1.186 ...
## .. ..$ :List of 6
## .. .. ..$ b0: num -0.263
## .. .. ..$ b1: num 1.85
## .. .. ..$ b2: num [1:13(1d)] -0.319 1.633 1.615 1.293 1.938 ...
## .. .. ..$ b3: num 1.67
## .. .. ..$ mu: num [1:560(1d)] 1.70e+17 3.60e+16 1.25e+17 2.37e+16 8.90e+17 ...
## .. .. ..$ b4: num [1:14(1d)] 0 -0.319 1.633 1.615 1.293 ...
## ..@ stan_args :List of 3
## .. ..$ :List of 9
## .. .. ..$ chain_id : int 1
## .. .. ..$ iter : int 1000
## .. .. ..$ thin : int 1
## .. .. ..$ seed : int 827152961
## .. .. ..$ warmup : num 500
## .. .. ..$ init : chr "random"
## .. .. ..$ algorithm : chr "NUTS"
## .. .. ..$ check_unknown_args: logi FALSE
## .. .. ..$ method : chr "sampling"
## .. ..$ :List of 9
## .. .. ..$ chain_id : int 2
## .. .. ..$ iter : int 1000
## .. .. ..$ thin : int 1
## .. .. ..$ seed : int 827152961
## .. .. ..$ warmup : num 500
## .. .. ..$ init : chr "random"
## .. .. ..$ algorithm : chr "NUTS"
## .. .. ..$ check_unknown_args: logi FALSE
## .. .. ..$ method : chr "sampling"
## .. ..$ :List of 9
## .. .. ..$ chain_id : int 3
## .. .. ..$ iter : int 1000
## .. .. ..$ thin : int 1
## .. .. ..$ seed : int 827152961
## .. .. ..$ warmup : num 500
## .. .. ..$ init : chr "random"
## .. .. ..$ algorithm : chr "NUTS"
## .. .. ..$ check_unknown_args: logi FALSE
## .. .. ..$ method : chr "sampling"
## ..@ stanmodel :Formal class 'stanmodel' [package "rstan"] with 5 slots
## .. .. ..@ model_name : chr "alcModel"
## .. .. ..@ model_code : chr "data {\n int N;\n int deaths[N];\n vector[N] year;\n int<lower=1, upper=14> age[N];\n vector[N] gender;\n "| __truncated__
## .. .. .. ..- attr(*, "model_name2")= chr "alcModel"
## .. .. ..@ model_cpp :List of 2
## .. .. .. ..$ model_cppname: chr "model39f86e656888_alcModel"
## .. .. .. ..$ model_cppcode: chr "// Code generated by Stan version 2.21.0\n\n#include <stan/model/model_header.hpp>\n\nnamespace model39f86e6568"| __truncated__
## .. .. ..@ mk_cppmodule:function (object)
## .. .. ..@ dso :Formal class 'cxxdso' [package "rstan"] with 7 slots
## .. .. .. .. ..@ sig :List of 1
## .. .. .. .. .. ..$ file39f868cf4f4e: chr(0)
## .. .. .. .. ..@ dso_saved : logi TRUE
## .. .. .. .. ..@ dso_filename: chr "file39f868cf4f4e"
## .. .. .. .. ..@ modulename : chr "stan_fit4model39f86e656888_alcModel_mod"
## .. .. .. .. ..@ system : chr "x86_64, mingw32"
## .. .. .. .. ..@ cxxflags : chr "CXXFLAGS = -O2 -Wall $(DEBUGFLAG) -mfpmath=sse -msse2 -mstackrealign $(LTO)"
## .. .. .. .. ..@ .CXXDSOMISC :<environment: 0x00000000258cb340>
## ..@ date : chr "Tue Dec 28 11:38:45 2021"
## ..@ .MISC :<environment: 0x000000002604fa48>
# --- extract the result to a tibble -----------------------
simDF <- stan_to_df(results)
# --- show the results -------------------------------------
print(simDF)
## # A tibble: 3,000 x 19
## chain iter b0 b1 b2_1 b2_2 b2_3 b2_4 b2_5 b2_6 b2_7
## <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 -0.872 -0.123 1.02 1.09 -1.43 -0.0502 -1.79 -0.862 0.711
## 2 1 2 -0.872 -0.123 1.02 1.09 -1.43 -0.0502 -1.79 -0.862 0.711
## 3 1 3 -0.872 -0.123 1.02 1.09 -1.43 -0.0502 -1.79 -0.862 0.711
## 4 1 4 -0.872 -0.123 1.02 1.09 -1.43 -0.0502 -1.79 -0.862 0.711
## 5 1 5 -0.872 -0.123 1.02 1.09 -1.43 -0.0502 -1.79 -0.862 0.711
## 6 1 6 -0.872 -0.123 1.02 1.09 -1.43 -0.0502 -1.79 -0.862 0.711
## 7 1 7 -0.870 -0.101 1.02 1.09 -1.43 -0.0500 -1.79 -0.862 0.711
## 8 1 8 -0.869 -0.0885 1.02 1.09 -1.43 -0.0499 -1.79 -0.861 0.711
## 9 1 9 -0.865 -0.0524 1.02 1.09 -1.43 -0.0501 -1.79 -0.861 0.712
## 10 1 10 -0.854 0.0626 1.02 1.09 -1.43 -0.0493 -1.79 -0.859 0.714
## # ... with 2,990 more rows, and 8 more variables: b2_8 <dbl>, b2_9 <dbl>,
## # b2_10 <dbl>, b2_11 <dbl>, b2_12 <dbl>, b2_13 <dbl>, b3 <dbl>, lp__ <dbl>
Greta
greta exists on a knife-edge. It is an R package with many dependencies; it works via Python code that calls tensorflow and tensorflow-probability. It will only work if every step in this chain is compatible.
I installed miniconda and the appropriate version of tensorflow, as recommended on the greta website.
reticulate::install_miniconda()
reticulate::conda_create(
envname = "greta-env",
python_version = "3.7"
)
reticulate::conda_install(
envname = "greta-env",
packages = c(
"numpy==1.16.4",
"tensorflow-probability==0.7.0",
"tensorflow==1.14.0"
)
)
I updated every R package in my library and downloaded the latest greta from CRAN; no joy. Only when I installed the development version of greta could I get even the simplest program to run.
devtools::install_github("greta-dev/greta")
The code below was run with greta 0.4.0 (2021-11-26). Hopefully, when greta is updated to use tensorflow 2.0, it will become more robust.
When it works, it is very impressive.
Here is the Poisson regression written for greta. The whole thing is R code, and parallel computation over the chains is handled automatically.
library(greta)
# --- prepare the data --------------------------------------
deaths <- as_data(as.integer(alcDF$deaths))
offset <- log(alcDF$pop)
year <- as.numeric(alcDF$year - 2001)
gender <- as.numeric( alcDF$gender == "male")
age <- factor( alcDF$age)
# --- construct the design matrix .. as_data() is a greta function
X <- as_data(model.matrix(~ year + age + gender))
# --- prior on the coefficients -----------------------------
b <- normal(0, 10, dim=16)
# --- the likelihood part of the model ----------------------
mu <- exp(X %*% b + offset)
distribution(deaths) <- poisson(mu)
# --- prepare the greta analysis ----------------------------
alcModel <- model(b)
# --- run three chains --------------------------------------
mcmc(alcModel, n_samples = 1000, chains = 3) %>%
saveRDS( file.path( home, "data/dataStore/alcGretaP01.rds")) %>%
system.time()
It is fast and, because it is built on tensorflow, greta should scale competitively.
The returned object can be read using my greta_to_df() function.
# --- read the greta results ----------------------------------
results <- readRDS(file.path( home, "data/dataStore/alcGretaP01.rds"))
# --- look at the structure -----------------------------------
str(results)
## List of 3
## $ 11: 'mcmc' num [1:1000, 1:16] -0.527 -0.538 -0.538 -0.535 -0.546 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:1000] "1" "2" "3" "4" ...
## .. ..$ : chr [1:16] "b[1,1]" "b[2,1]" "b[3,1]" "b[4,1]" ...
## ..- attr(*, "mcpar")= num [1:3] 1 1000 1
## $ 12: 'mcmc' num [1:1000, 1:16] -0.513 -0.497 -0.504 -0.501 -0.501 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:1000] "1" "2" "3" "4" ...
## .. ..$ : chr [1:16] "b[1,1]" "b[2,1]" "b[3,1]" "b[4,1]" ...
## ..- attr(*, "mcpar")= num [1:3] 1 1000 1
## $ 13: 'mcmc' num [1:1000, 1:16] -0.569 -0.575 -0.581 -0.558 -0.559 ...
## ..- attr(*, "dimnames")=List of 2
## .. ..$ : chr [1:1000] "1" "2" "3" "4" ...
## .. ..$ : chr [1:16] "b[1,1]" "b[2,1]" "b[3,1]" "b[4,1]" ...
## ..- attr(*, "mcpar")= num [1:3] 1 1000 1
## - attr(*, "class")= chr [1:2] "greta_mcmc_list" "mcmc.list"
## - attr(*, "model_info")=List of 3
## ..$ raw_draws:List of 3
## .. ..$ 11: 'mcmc' num [1:1000, 1:16] -0.527 -0.538 -0.538 -0.535 -0.546 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:1000] "1" "2" "3" "4" ...
## .. .. .. ..$ : chr [1:16] "1" "2" "3" "4" ...
## .. .. ..- attr(*, "mcpar")= num [1:3] 1 1000 1
## .. ..$ 12: 'mcmc' num [1:1000, 1:16] -0.513 -0.497 -0.504 -0.501 -0.501 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:1000] "1" "2" "3" "4" ...
## .. .. .. ..$ : chr [1:16] "1" "2" "3" "4" ...
## .. .. ..- attr(*, "mcpar")= num [1:3] 1 1000 1
## .. ..$ 13: 'mcmc' num [1:1000, 1:16] -0.569 -0.575 -0.581 -0.558 -0.559 ...
## .. .. ..- attr(*, "dimnames")=List of 2
## .. .. .. ..$ : chr [1:1000] "1" "2" "3" "4" ...
## .. .. .. ..$ : chr [1:16] "1" "2" "3" "4" ...
## .. .. ..- attr(*, "mcpar")= num [1:3] 1 1000 1
## .. ..- attr(*, "class")= chr "mcmc.list"
## ..$ samplers :List of 1
## .. ..$ 1:Classes 'hmc_sampler', 'sampler', 'inference', 'R6' <hmc_sampler>
## Inherits from: <sampler>
## Public:
## accept_history: FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FA ...
## accept_target: 0.651
## burst_lengths: function (n_samples, pb_update, warmup = FALSE)
## check_initial_values: function (inits)
## clone: function (deep = FALSE)
## define_tf_draws: function ()
## define_tf_kernel: function ()
## free_state: -1.12015257508273 -1.01928896310261 -0.993346205409014 0 ...
## hbar: 0.00845652330560466
## in_periods: function (periods, i, n_samples)
## initialize: function (initial_values, model, parameters = list(), seed)
## last_burst_free_states: list
## log_epsilon_bar: -3.97082330013737
## mean_accept_stat: 0.984376119213626
## model: greta_model
## n_chains: 3
## n_free: 16
## n_samplers: 1
## n_traced: 48
## numerical_rejections: 0
## parameters: list
## pb_file: NULL
## pb_width: 80
## percentage_file: NULL
## print_sampler_number: function ()
## run_burst: function (n_samples, thin = 1L)
## run_chain: function (n_samples, thin, warmup, verbose, pb_update, one_by_one,
## sample_carefully: function (n_samples)
## sample_variance: function ()
## sampler_number: 1
## sampler_parameter_values: function ()
## seed: 883959483492
## set_initial_values: function (init_list)
## set_tf_seed: function ()
## sum_epsilon_trace: NULL
## thin: 1
## trace: function (free_state = TRUE, values = FALSE)
## trace_batch_size: 100
## trace_burst_values: function (free_states = self$last_burst_free_states)
## trace_log_file: NULL
## trace_values: function (trace_batch_size)
## traced_free_state: list
## traced_values: list
## tune: function (iterations_completed, total_iterations)
## tune_diag_sd: function (iterations_completed, total_iterations)
## tune_epsilon: function (iter, total)
## tuning_interval: 3
## tuning_periods: list
## update_welford: function ()
## uses_metropolis: TRUE
## valid_parameters: function (parameters)
## welford_state: list
## write_percentage_log: function (total, completed, stage)
## write_trace_to_log_file: function (last_burst_values)
## ..$ model :List of 3
## .. ..$ dag :Classes 'dag_class', 'R6' <dag_class>
## Public:
## adjacency_matrix: active binding
## build_dag: function (greta_array_list)
## build_feed_dict: function (dict_list = list(), data_list = self$get_tf_data_list())
## clone: function (deep = FALSE)
## compile: TRUE
## define_batch_size: function ()
## define_free_state: function (type = c("variable", "placeholder"), name = "free_state")
## define_joint_density: function ()
## define_tf: function (target_nodes = self$node_list)
## define_tf_body: function (target_nodes = self$node_list)
## define_tf_session: function ()
## draw_sample: function (distribution_node)
## evaluate_density: function (distribution_node, target_node)
## example_parameters: function (free = TRUE)
## generate_log_prob_function: function (which = c("adjusted", "unadjusted", "both"))
## get_tf_data_list: function ()
## get_tf_names: function (types = NULL)
## get_tf_object: function (node)
## get_tfp_distribution: function (distrib_node)
## hessians: function ()
## how_to_define: function (node)
## how_to_define_all_sampling: function (node)
## how_to_define_hybrid: function (node)
## initialize: function (target_greta_arrays, tf_float = "float32", compile = FALSE)
## log_density: function ()
## mode: all_forward
## n_cores: 8
## new_tf_environment: function ()
## node_list: list
## node_tf_names: active binding
## node_types: active binding
## on_graph: function (expr)
## send_parameters: function (parameters)
## set_tf_data_list: function (element_name, value)
## split_free_state: function ()
## subgraph_membership: function ()
## target_nodes: list
## tf_environment: environment
## tf_evaluate_density: function (tfp_distribution, tf_target, truncation = NULL, bounds = NULL)
## tf_float: float64
## tf_graph: tensorflow.python.framework.ops.Graph, python.builtin.object
## tf_name: function (node)
## tf_run: function (expr, as_text = FALSE)
## tf_sess_run: function (expr, as_text = FALSE)
## trace_names: b[1,1] b[2,1] b[3,1] b[4,1] b[5,1] b[6,1] b[7,1] b[8,1] ...
## trace_values: function (free_state, flatten = TRUE, trace_batch_size = Inf)
## trace_values_batch: function (free_state_batch)
## variables_without_free_state: list
## .. ..$ target_greta_arrays :List of 1
## .. .. ..$ b:'greta_array' num [1:16, 1] ? ? ? ? ? ? ? ? ? ? ... .. ..$ visible_greta_arrays:List of 4
## .. .. ..$ b :'greta_array' num [1:16, 1] ? ? ? ? ? ? ? ? ? ? ... .. .. ..$ deaths:'greta_array' int [1:560, 1] 4 5 23 23 115 71 272 159 457 232 ... .. .. ..$ mu :'greta_array' num [1:560, 1] ? ? ? ? ? ? ? ? ? ? ... .. .. ..$ X :'greta_array' num [1:560, 1:16] 1 1 1 1 1 1 1 1 1 1 ... 'greta_array' - attr(*, "assign")= int [1:16] 0 1 2 2 2 2 2 2 2 2 ... 'greta_array' - attr(*, "contrasts")=List of 1 'greta_array' ..$ age: chr "contr.treatment" .. ..- attr(*, "class")= chr "greta_model"
# --- convert to a tibble -------------------------------------
simDF <- greta_to_df(results)
# --- look at the tibble --------------------------------------
print(simDF)
## # A tibble: 3,000 x 18
## chain iter b_1_1 b_2_1 b_3_1 b_4_1 b_5_1 b_6_1 b_7_1 b_8_1 b_9_1 b_10_1
## <fct> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 1 -0.527 0.0139 0.292 1.34 2.09 2.65 3.02 3.20 3.29 3.26
## 2 1 2 -0.538 0.0142 0.335 1.34 2.10 2.66 3.04 3.21 3.30 3.27
## 3 1 3 -0.538 0.0141 0.339 1.33 2.10 2.68 3.04 3.22 3.30 3.27
## 4 1 4 -0.535 0.0143 0.330 1.34 2.11 2.65 3.04 3.22 3.30 3.27
## 5 1 5 -0.546 0.0143 0.339 1.35 2.11 2.67 3.05 3.22 3.31 3.28
## 6 1 6 -0.542 0.0141 0.318 1.35 2.10 2.65 3.05 3.22 3.31 3.27
## 7 1 7 -0.526 0.0139 0.324 1.34 2.10 2.65 3.03 3.20 3.29 3.25
## 8 1 8 -0.541 0.0141 0.270 1.34 2.09 2.67 3.03 3.21 3.29 3.26
## 9 1 9 -0.533 0.0142 0.281 1.35 2.09 2.66 3.03 3.21 3.29 3.26
## 10 1 10 -0.545 0.0147 0.313 1.34 2.09 2.64 3.03 3.20 3.29 3.27
## # ... with 2,990 more rows, and 6 more variables: b_11_1 <dbl>, b_12_1 <dbl>,
## # b_13_1 <dbl>, b_14_1 <dbl>, b_15_1 <dbl>, b_16_1 <dbl>
Appendix
Here are the functions referred to in the post.
Code to clean the data
library(tidyverse)
home <- "C:/Projects/kaggle/sliced/methods/methods_bayes_software"
filename <- "data/rawData/alcoholspecificdeaths2020.xlsx"
# --- read without column names -------------------------------------
readxl::read_excel( file.path(home, filename),
sheet="Table 2",
range="C6:V385",
col_names=FALSE) %>%
# --- rename relevant columns -------------------------------------
rename( year = `...1`,
age = `...2`,
deaths_male = `...10`,
rate_male = `...11`,
deaths_female = `...16`,
rate_female = `...17`) %>%
# --- select relevant columns -------------------------------------
select( year, age, deaths_male, rate_male,
deaths_female, rate_female) %>%
# --- rates from character to numeric -----------------------------
mutate( rate_male = as.numeric(rate_male),
rate_female = as.numeric(rate_female)) %>%
# --- drop the children -------------------------------------------
filter( !(age %in% c("<1", "01-04", "05-09",
"10-14", "15-19"))) %>%
mutate(age = factor(age)) %>%
# --- reshape the data --------------------------------------------
pivot_longer(cols=starts_with("deaths") | starts_with("rate"),
names_to=c("type", "gender"),
names_sep = "_") %>%
pivot_wider( values_from=value, names_from=type) %>%
# --- calc population from #deaths & rate per 100,000 -------------
mutate( pop = round(100000*deaths/rate)/100000 ) %>%
# --- use average population as annual pop unreliable
# when #deaths is small
group_by( gender, age ) %>%
mutate( pop = round(mean(pop, na.rm=TRUE), 2) ) %>%
ungroup() %>%
select( -rate) %>%
saveRDS( file.path(home, "data/rData/alc.rds"))
Convert a BUGS object to a tibble
# --- Function to put bugs simulations in a tibble -----------------
# argument
# bugsObject ... R Object returned by bugs()
# return
# tibble of simulations plus chain and iteration number
#
bugs_to_df <- function(bugsObject) {
vNames <- attr(bugsObject$sims.matrix,"dimnames")[[2]]
vNames <- str_replace(vNames, "\\[", "_")
vNames <- str_replace(vNames, ",", "_")
vNames <- str_replace(vNames, "\\]", "")
thisDF <- setNames(as_tibble(bugsObject$sims.matrix), vNames)
nc <- bugsObject$n.chains
ni <- bugsObject$n.keep
thisDF$chain <- factor(rep(1:nc, each=ni))
thisDF$iter <- rep(1:ni, nc )
return( thisDF[, c("chain", "iter", vNames)])
}
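The renaming step turns BUGS-style names such as b[2,1] into syntactically valid column names. A minimal sketch of just that step, using made-up parameter names:

```r
library(stringr)

# hypothetical parameter names as they appear in sims.matrix
vNames <- c("b[1,1]", "b[16,1]", "deviance")
vNames <- str_replace(vNames, "\\[", "_")   # b[1,1] -> b_1,1]
vNames <- str_replace(vNames, ",", "_")     # b_1,1] -> b_1_1]
vNames <- str_replace(vNames, "\\]", "")    # b_1_1] -> b_1_1
vNames
## "b_1_1"  "b_16_1" "deviance"
```

str_replace() only replaces the first match, which is all that is needed here because each name contains at most one `[`, one `,` and one `]`.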
Convert a nimble object to a tibble
# --- Function to put nimble simulations in a tibble ---------------
# argument
# nimbleObject ... R Object returned by nimble
# return
# tibble of simulations plus chain and iteration number
#
nimble_to_df <- function(nimbleObject) {
if( typeof(nimbleObject) == "list" ) {
nc <- length(nimbleObject)
ni <- nrow(nimbleObject$chain1)
vNames <- attr(nimbleObject$chain1,"dimnames")[[2]]
thisDF <- NULL
for( i in 1:nc ) {
thisDF <- bind_rows(thisDF, as_tibble(nimbleObject[[i]]))
}
}
else {
nc <- 1
ni <- nrow(nimbleObject)
vNames <- attr(nimbleObject,"dimnames")[[2]]
thisDF <- as_tibble(nimbleObject)
}
vNames <- str_replace(vNames, "\\[", "_")
vNames <- str_replace(vNames, ",", "_")
vNames <- str_replace(vNames, "\\]", "")
thisDF <- setNames(thisDF, vNames)
thisDF$chain <- factor(rep(1:nc, each=ni))
thisDF$iter <- rep(1:ni, nc )
return( thisDF[, c("chain", "iter", vNames)])
}
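All of the converters build the chain and iteration labels in the same way. With, say, two chains each of three kept iterations, the bookkeeping looks like this:

```r
nc <- 2   # number of chains
ni <- 3   # iterations kept per chain
rep(1:nc, each=ni)   # chain labels:     1 1 1 2 2 2
rep(1:ni, nc)        # iteration labels: 1 2 3 1 2 3
```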
Convert a Stan object to a tibble
# --- Function to put stan simulations in a tibble -----------------
# argument
# stanObject ... R Object returned by stan()
# return
# tibble of simulations plus chain and iteration number
#
stan_to_df <- function(stanObject) {
  nc <- length(stanObject@sim$samples)
  ni <- stanObject@sim$iter
  vNames <- stanObject@sim$fnames_oi
  vNames <- str_replace(vNames, "\\[", "_")
  vNames <- str_replace(vNames, ",", "_")
  vNames <- str_replace(vNames, "\\]", "")
  thisDF <- NULL
  for( i in 1:nc ) {
    thisDF <- bind_rows(thisDF, as_tibble(as.data.frame(stanObject@sim$samples[[i]])))
  }
  thisDF <- setNames(thisDF, vNames)
  thisDF$chain <- factor(rep(1:nc, each=ni))
  thisDF$iter <- rep(1:ni, nc )
  return( thisDF[, c("chain", "iter", vNames)])
}
Convert a greta object to a tibble
# --- Function to put greta simulations in a tibble ----------------
# argument
# gretaObject ... R Object returned by greta's mcmc()
# return
# tibble of simulations plus chain and iteration number
#
greta_to_df <- function(gretaObject) {
par <- attr(gretaObject[[1]], "mcpar")
nc <- length(gretaObject)
ni <- par[2]
vNames <- attr(gretaObject[[1]], "dimnames")[[2]]
vNames <- str_replace(vNames, "\\[", "_")
vNames <- str_replace(vNames, ",", "_")
vNames <- str_replace(vNames, "\\]", "")
thisDF <- NULL
for( i in 1:nc ) {
thisDF <- bind_rows(thisDF, as_tibble(as.data.frame(gretaObject[[i]])))
}
thisDF <- setNames(thisDF, vNames)
thisDF$chain <- factor(rep(1:nc, each=ni))
thisDF$iter <- rep(1:ni, nc )
return( thisDF[, c("chain", "iter", vNames)])
}
Read a set of rds files containing Bayesian simulations
The function determines the source of the simulations from the class of the object in the file.
# --- Function to read rds files of saved simulations --------------
# argument
# files ... character vector of paths to the rds files
# return
# tibble of simulations plus chain and iteration number
#
bayes_to_df <- function(files) {
  ichain <- 0
  S <- NULL
  nfiles <- length(files)
  for( f in 1:nfiles) {
    object <- readRDS(files[f])
    # --- pick the converter from the class of the object ----------
    if( inherits(object, "bugs") ) {
      df <- bugs_to_df(object)
    } else if( inherits(object, "stanfit") ) {
      df <- stan_to_df(object)
    } else if( inherits(object, "mcmc.list") ) {
      df <- greta_to_df(object)
    } else if( is.matrix(object) | is.list(object) ) {
      df <- nimble_to_df(object)
    } else {
      stop("unrecognised simulation object in ", files[f])
    }
    # --- renumber the chains so that they do not clash ------------
    df$chain <- as.numeric(df$chain)
    nchain <- max(df$chain)
    df$chain <- ichain + df$chain
    ichain <- ichain + nchain
    S <- rbind(S, df)
  }
  if( ichain > 1 ) {
    as_tibble(S) %>% mutate( chain = factor(chain))
  } else {
    as_tibble(S)
  }
}
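As a usage sketch, suppose that earlier runs had been saved as rds files; the file names below are invented for illustration:

```r
# --- hypothetical files of saved simulations ---------------------
files <- file.path(home, "data/rData",
                   c("nimble_sims.rds", "bugs_sims.rds"))
# --- pool them into one tibble, renumbering the chains -----------
simDF <- bayes_to_df(files)
```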