Select parameters from a single (sampling) iteration of the Markov chain

Using the selected parameters and the model, simulate a data set with the same size (number of observations/variables)

From the simulated data set, calculate selected summary statistics (e.g. mean)

Repeat steps 1-3 for a fixed number of iterations (perhaps across the whole chain)

When done, compare position of observed summary statistics to that of the distribution of summary statitsics from simulated data sets (predictive distribution)

Example PPMC

For our model, we have one observed variable that is in the model (\(Y_p\))

Note, the observations in \(\textbf{X}\) are not modeled, so we do not examine these

First, let’s assemble the posterior draws:

# here, we use format = "draws_matrix" to remove the draws from the array format they default toposteriorSample = model06_Samples$draws(variables =c("beta", "sigma"), format ="draws_matrix")posteriorSample

C is a constant that cancels out when model comparisons are made

Here,

\(\overline{D\left(\theta\right)}\) is the average log likelihood of the data (\(y\)) given the parameters (\(\theta\)) computed across all samples

\(D\left(\bar{\theta}\right)\) is the log likelihood of the data (\(y\)) computed at the average of the parameters (\(\bar{\theta}\)) computed across all samples

Newer Methods

The DIC has fallen out of favor recently

Has issues when parameters are discrete

Not fully Bayesian (point estimate of average of parameter values)

Can give negative values for estimated numbers of parameters in a model

WAIC (Widely applicable or Watanabe-Akaike information criterion, Watanabe, 2010) corrects some of the problems with DIC:

LOO: Leave One Out via Pareto Smoothed Importance Sampling

More recently, approximations to LOO have gained popularity

LOO via Pareto Smoothed Importance Sampling attempts to approximate the process of leave-one-out cross-validation using a sampling based-approach

Gives a finite-sample approximation

Implemented in Stan

Can quickly compare models

Gives warnings when it may be less reliable to use

The details are very technical, but are nicely compiled in Vehtari, Gelman, and Gabry (2017) (see preprint on arXiv at https://arxiv.org/pdf/1507.04544.pdf)

Big picture:

Can compute WAIC and/or LOO via Stan’s loo package

LOO also gives variability for model comparisons

Model Comparisons via Stan

The core of WAIC and LOO is the log-likelihood of the data conditional on the model parameters, as calculated for each observation in the sample of the model:

We can calculate this using the generated quantities block in Stan:

generated quantities{// general quantities used below: vector[N] y_pred; y_pred = X*beta; // predicted value (conditional mean)// WAIC and LOO for model comparison array[N] real log_lik;for (person in1:N){ log_lik[person] =normal_lpdf(y[person] | y_pred[person], sigma); }}

WAIC in Stan

Using the loo package, we can calculate WAIC with the waic() function

waic(model06_Samples$draws("log_lik"))

Computed from 8000 by 30 log-likelihood matrix
Estimate SE
elpd_waic -109.7 6.9
p_waic 6.8 2.7
waic 219.3 13.7
4 (13.3%) p_waic estimates greater than 0.4. We recommend trying loo instead.

Here:

elpd_waic is the expected log pointwise predictive density for WAIC

p_waic is the WAIC calculation of number of model parameters (a penalty to the likelihood for more parameters)

waic is the WAIC index used for model comparisons (lowest value is best fitting; -2*elpd_waic)

Note: WAIC needs a “log_lik” variable in the model analysis to be calculated correctly * That is up to you to provide!!

LOO in Stan

Using the loo package, we can calculate the PSIS-LOO using cmdstanr objects with the $loo() function:

model06_Samples$loo()

Computed from 8000 by 30 log-likelihood matrix
Estimate SE
elpd_loo -110.3 7.2
p_loo 7.5 3.1
looic 220.6 14.4
------
Monte Carlo SE of elpd_loo is NA.
Pareto k diagnostic values:
Count Pct. Min. n_eff
(-Inf, 0.5] (good) 26 86.7% 1476
(0.5, 0.7] (ok) 3 10.0% 433
(0.7, 1] (bad) 1 3.3% 33
(1, Inf) (very bad) 0 0.0% <NA>
See help('pareto-k-diagnostic') for details.

Note: LOO needs a “log_lik” variable in the model analysis to be calculated correctly

That is up to you to provide!!

If named “log_lik” then you don’t need to provide the function an argument

Notes about LOO

model06_Samples$loo()

Computed from 8000 by 30 log-likelihood matrix
Estimate SE
elpd_loo -110.3 7.2
p_loo 7.5 3.1
looic 220.6 14.4
------
Monte Carlo SE of elpd_loo is NA.
Pareto k diagnostic values:
Count Pct. Min. n_eff
(-Inf, 0.5] (good) 26 86.7% 1476
(0.5, 0.7] (ok) 3 10.0% 433
(0.7, 1] (bad) 1 3.3% 33
(1, Inf) (very bad) 0 0.0% <NA>
See help('pareto-k-diagnostic') for details.

Here:

elpd_loo is the expected log pointwise predictive density for LOO

p_loo is the LOO calculation of number of model parameters (a penalty to the likelihood for more parameters)

looic is the LOO index used for model comparisons (lowest value is best fitting; -2*elpd_loo)

Also, you get a warning if to many of your sampled values have bad diagnostic values

To compare models with WAIC, take the one with the lowest value:

waic(model06_Samples$draws("log_lik"))

Computed from 8000 by 30 log-likelihood matrix
Estimate SE
elpd_waic -109.7 6.9
p_waic 6.8 2.7
waic 219.3 13.7
4 (13.3%) p_waic estimates greater than 0.4. We recommend trying loo instead.

waic(model06b_Samples$draws("log_lik"))

Computed from 8000 by 30 log-likelihood matrix
Estimate SE
elpd_waic -370.2 30.9
p_waic 17.2 4.1
waic 740.3 61.8
12 (40.0%) p_waic estimates greater than 0.4. We recommend trying loo instead.

Comparing Models with LOO

To compare models with LOO, the loo package has a built-in comparison function:

# comparing two models with loo:loo_compare(list(uniformative = model06_Samples$loo(), informative = model06b_Samples$loo()))