1. Select parameters from a single (sampling) iteration of the Markov chain
2. Using the selected parameters and the model, simulate a data set of the same size (number of observations/variables)
3. From the simulated data set, calculate selected summary statistics (e.g., the mean)
4. Repeat steps 1-3 for a fixed number of iterations (perhaps across the whole chain)
5. When done, compare the position of the observed summary statistics to the distribution of summary statistics from the simulated data sets (the predictive distribution)
Example PPMC
For our model, we have one observed variable that is modeled (\(Y_p\))
Note, the observations in \(\textbf{X}\) are not modeled, so we do not examine these
First, let’s assemble the posterior draws:
# here, we use format = "draws_matrix" to remove the draws from the array format they default to
posteriorSample = model06_Samples$draws(
  variables = c("beta", "sigma"),
  format = "draws_matrix"
)
posteriorSample
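With the draws assembled, here is a minimal sketch of steps 2-5 for the summary statistic "mean of \(Y\)". It assumes X (the \(N \times P\) design matrix) and y (the outcome vector) are the data objects that were passed to Stan; simMean is a name introduced here for illustration:

# PPMC sketch: simulate the mean of Y once per posterior draw
simMean = rep(NA, nrow(posteriorSample))
for (draw in 1:nrow(posteriorSample)){
  # step 1: parameters from a single sampling iteration
  beta = as.numeric(posteriorSample[draw, grep("beta", colnames(posteriorSample))])
  sigma = as.numeric(posteriorSample[draw, "sigma"])

  # step 2: simulate a data set of the same size from the model
  ySim = rnorm(n = nrow(X), mean = X %*% beta, sd = sigma)

  # step 3: calculate the summary statistic
  simMean[draw] = mean(ySim)
}

# step 5: compare the observed statistic to the predictive distribution
hist(simMean)
abline(v = mean(y), col = "red", lwd = 2)

The proportion of simulated means that exceed the observed mean acts as a posterior predictive p-value; values very close to 0 or 1 suggest the model fails to reproduce that feature of the data.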
The DIC is based on the deviance, \(D\left(\theta\right) = -2\log p\left(y \mid \theta\right) + C\); C is a constant that cancels out when model comparisons are made

The index combines the average deviance with a penalty for the effective number of parameters:

\[
\text{DIC} = \overline{D\left(\theta\right)} + p_D, \text{ where } p_D = \overline{D\left(\theta\right)} - D\left(\bar{\theta}\right)
\]

Here,
\(\overline{D\left(\theta\right)}\) is the deviance of the data (\(y\)) given the parameters (\(\theta\)), averaged across all samples
\(D\left(\bar{\theta}\right)\) is the deviance of the data (\(y\)) evaluated at the average of the parameters (\(\bar{\theta}\)) taken across all samples
Newer Methods
The DIC has fallen out of favor recently
Has issues when parameters are discrete
Not fully Bayesian (relies on a point estimate: the average of the parameter values)
Can give negative values for the estimated number of parameters in a model
WAIC (the widely applicable or Watanabe-Akaike information criterion; Watanabe, 2010) corrects some of the problems with DIC
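As a reference, a standard pointwise formulation (following Gelman, Hwang, and Vehtari, 2014), with \(S\) posterior draws \(\theta^{(s)}\) and \(N\) observations, is:

\[
\text{lppd} = \sum_{p=1}^{N} \log \left( \frac{1}{S} \sum_{s=1}^{S} p\left(y_p \mid \theta^{(s)}\right) \right)
\]

\[
p_\text{WAIC} = \sum_{p=1}^{N} \text{Var}_{s}\left[ \log p\left(y_p \mid \theta^{(s)}\right) \right], \qquad \text{WAIC} = -2\left(\text{lppd} - p_\text{WAIC}\right)
\]

Unlike DIC, both pieces use the full set of posterior draws rather than a point estimate, and the penalty \(p_\text{WAIC}\), a sum of variances, cannot be negative.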
LOO: Leave One Out via Pareto Smoothed Importance Sampling
More recently, approximations to LOO have gained popularity
LOO via Pareto Smoothed Importance Sampling attempts to approximate the process of leave-one-out cross-validation using a sampling-based approach (the target quantity is sketched after this list)
Gives a finite-sample approximation
Implemented in Stan
Can quickly compare models
Gives warnings when it may be less reliable to use
The details are very technical, but are nicely compiled in Vehtari, Gelman, and Gabry (2017) (see preprint on arXiv at https://arxiv.org/pdf/1507.04544.pdf)
Big picture:
Can compute WAIC and/or LOO via the Stan team's loo package in R
LOO also gives variability for model comparisons
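In the notation of Vehtari, Gelman, and Gabry (2017), the quantity being approximated is the expected log pointwise predictive density under leave-one-out cross-validation:

\[
\text{elpd}_\text{loo} = \sum_{p=1}^{N} \log p\left(y_p \mid y_{-p}\right)
\]

where \(y_{-p}\) is the data set with observation \(p\) removed. PSIS estimates each term by reweighting the existing posterior draws with smoothed importance weights instead of refitting the model \(N\) times, which is what makes the comparison quick.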
Model Comparisons via Stan
The core of WAIC and LOO is the log likelihood of the data conditional on the model parameters, calculated separately for each observation in the sample; for our model, this is \(\log p\left(y_p \mid \beta, \sigma\right) = \log N\left(y_p \mid \textbf{X}_p\beta, \sigma\right)\)
We can calculate this using the generated quantities block in Stan:
generated quantities{
  // general quantities used below:
  vector[N] y_pred;
  y_pred = X*beta; // predicted value (conditional mean)

  // WAIC and LOO for model comparison
  array[N] real log_lik;
  for (person in 1:N){
    log_lik[person] = normal_lpdf(y[person] | y_pred[person], sigma);
  }
}
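A quick way to confirm the generated quantity is present before computing the criteria (a minimal check; the dimensions match the "8000 by 30 log-likelihood matrix" reported in the output below):

# pull the pointwise log likelihoods as a draws-by-observations matrix
logLik = model06_Samples$draws("log_lik", format = "draws_matrix")
dim(logLik)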
WAIC in Stan
Using the loo package, we can calculate WAIC with the waic() function
waic(model06_Samples$draws("log_lik"))
Computed from 8000 by 30 log-likelihood matrix
Estimate SE
elpd_waic -109.7 6.9
p_waic 6.8 2.7
waic 219.3 13.7
4 (13.3%) p_waic estimates greater than 0.4. We recommend trying loo instead.
Here:
elpd_waic is the expected log pointwise predictive density for WAIC
p_waic is the WAIC estimate of the number of model parameters (a penalty to the likelihood for more parameters)
waic is the WAIC index used for model comparisons (lowest value is best fitting; -2*elpd_waic)
Note: WAIC needs a “log_lik” variable in the model analysis to be calculated correctly
That is up to you to provide!!
LOO in Stan
Using the loo package, we can calculate the PSIS-LOO using cmdstanr objects with the $loo() function:
model06_Samples$loo()
Computed from 8000 by 30 log-likelihood matrix
Estimate SE
elpd_loo -110.3 7.2
p_loo 7.5 3.1
looic 220.6 14.4
------
Monte Carlo SE of elpd_loo is NA.
Pareto k diagnostic values:
Count Pct. Min. n_eff
(-Inf, 0.5] (good) 26 86.7% 1476
(0.5, 0.7] (ok) 3 10.0% 433
(0.7, 1] (bad) 1 3.3% 33
(1, Inf) (very bad) 0 0.0% <NA>
See help('pareto-k-diagnostic') for details.
Note: LOO needs a “log_lik” variable in the model analysis to be calculated correctly
That is up to you to provide!!
If named “log_lik” then you don’t need to provide the function an argument
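For example (a hypothetical sketch, assuming the variables argument of cmdstanr's $loo() method and a generated quantity that was instead named logLikelihood):

# only needed if the generated quantity is not named "log_lik"
model06_Samples$loo(variables = "logLikelihood")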
Notes about LOO
In the LOO output above:
elpd_loo is the expected log pointwise predictive density for LOO
p_loo is the LOO estimate of the number of model parameters (a penalty to the likelihood for more parameters)
looic is the LOO index used for model comparisons (lowest value is best fitting; -2*elpd_loo)
Also, you get a warning if too many of your sampled values have bad Pareto k diagnostic values
To compare models with WAIC, take the one with the lowest value:
waic(model06_Samples$draws("log_lik"))
Computed from 8000 by 30 log-likelihood matrix
Estimate SE
elpd_waic -109.7 6.9
p_waic 6.8 2.7
waic 219.3 13.7
4 (13.3%) p_waic estimates greater than 0.4. We recommend trying loo instead.
waic(model06b_Samples$draws("log_lik"))
Computed from 8000 by 30 log-likelihood matrix
Estimate SE
elpd_waic -370.2 30.9
p_waic 17.2 4.1
waic 740.3 61.8
12 (40.0%) p_waic estimates greater than 0.4. We recommend trying loo instead.
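By this criterion, model06 (WAIC = 219.3) fits better than model06b (WAIC = 740.3), since lower WAIC values indicate better fit.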
Comparing Models with LOO
To compare models with LOO, the loo package has a built-in comparison function:
# comparing two models with loo:
loo_compare(list(uninformative = model06_Samples$loo(),
                 informative = model06b_Samples$loo()))
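In the resulting table, the best-fitting model appears in the first row with an elpd_diff of 0; each remaining row reports that model's difference in elpd from the best model along with the standard error of the difference (se_diff), which is the variability for model comparisons mentioned earlier.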