```stan
data {
  int<lower=0> N;        // number of observations
  int<lower=0> P;        // number of predictors (plus column for intercept)
  matrix[N, P] X;        // model.matrix() from R
  vector[N] y;           // outcome
  vector[P] meanBeta;    // prior mean vector for coefficients
  matrix[P, P] covBeta;  // prior covariance matrix for coefficients
  real sigmaRate;        // prior rate parameter for residual standard deviation
}
```
Syntax Changes: Parameters Section
Old Syntax Without Matrices
```stan
parameters {
  real beta0;
  real betaHeight;
  real betaGroup2;
  real betaGroup3;
  real betaHxG2;
  real betaHxG3;
  real<lower=0> sigma;
}
```
New Syntax With Matrices
```stan
parameters {
  vector[P] beta;       // vector of coefficients for Beta
  real<lower=0> sigma;  // residual standard deviation
}
```
Defining Prior Distributions
Previously, we defined a normal prior distribution for each regression coefficient
Univariate priors – univariate normal distribution
Each parameter had a prior that was independent of the other parameters
When combining all parameters into a vector, a natural extension is a multivariate normal distribution
```stan
model {
  beta ~ multi_normal(meanBeta, covBeta); // prior for coefficients
  sigma ~ exponential(sigmaRate);         // prior for sigma
  y ~ normal(X * beta, sigma);            // linear model
}
```
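To see that the matrix form computes the same linear predictor as the term-by-term version, here is a small numeric sketch (in Python with NumPy rather than Stan; the column layout of X is an assumption matching model.matrix() with an intercept, height, two group dummies, and two height-by-group interactions, and the coefficient values are illustrative):

```python
import numpy as np

# Hypothetical design matrix for 3 people: columns are
# [intercept, HeightIN, Group2, Group3, HeightIN*Group2, HeightIN*Group3]
height = np.array([60.0, 65.0, 70.0])
g2 = np.array([0.0, 1.0, 0.0])
g3 = np.array([0.0, 0.0, 1.0])
X = np.column_stack([np.ones(3), height, g2, g3, height * g2, height * g3])

beta = np.array([-100.0, 3.0, 5.0, 10.0, 0.5, -0.25])  # illustrative values

# Matrix form: one line, as in y ~ normal(X * beta, sigma)
pred_matrix = X @ beta

# Old term-by-term form with named coefficients
b0, bH, bG2, bG3, bHxG2, bHxG3 = beta
pred_named = (b0 + bH * height + bG2 * g2 + bG3 * g3
              + bHxG2 * height * g2 + bHxG3 * height * g3)

print(np.allclose(pred_matrix, pred_named))  # True
```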
See: Example Syntax in R File
Summary of Changes
With matrices, there is less syntax to write
Model is equivalent
Output, however, labels the coefficients only as beta[1], ..., beta[P] rather than by predictor name
Here, \(\beta_1\) is the change in \(\text{WeightLB}_p\) per one-unit change in \(\text{HeightIN}_p\) for a person in Diet Group 1 (i.e., \(\text{Group2}_p = 0\) and \(\text{Group3}_p = 0\))
Question: What is the slope for Diet Group 2?
To answer, we need to first form the model when \(\text{Group2}_p = 1\): \[\text{WeightLB}_p = \beta_0 + \beta_1\text{HeightIN}_p + \beta_2 + \beta_4\text{HeightIN}_p + e_p\]
Next, we rearrange terms that involve \(\text{HeightIN}_p\): \[\text{WeightLB}_p = (\beta_0 + \beta_2) + (\beta_1 + \beta_4)\text{HeightIN}_p + e_p\]
From here, we can see the slope for Diet Group 2 is \((\beta_1 + \beta_4)\)
Also, the intercept for Diet Group 2 is \((\beta_0 + \beta_2)\)
Computing Slope for Diet Group 2
Our task: Create posterior distribution for Diet Group 2
We must do so for each iteration we’ve kept from our MCMC chain
A somewhat tedious way to do this is to transform the saved posterior draws after Stan has finished sampling
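As a sketch of that post-hoc route (Python with NumPy; the draws below are simulated for illustration, whereas in practice they would be extracted from the fitted Stan object):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical posterior draws (one row per kept MCMC iteration) for the
# two coefficients involved in the Diet Group 2 slope
betaHeight_draws = rng.normal(3.0, 0.2, size=4000)
betaHxG2_draws = rng.normal(0.5, 0.3, size=4000)

# The "tedious" way: transform every kept draw by hand
slopeG2_draws = betaHeight_draws + betaHxG2_draws

# slopeG2_draws is now a sample from the posterior of beta1 + beta4
# and can be summarized like any other parameter
print(slopeG2_draws.mean(), np.quantile(slopeG2_draws, [0.025, 0.975]))
```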
Stan can compute these values for us, using the generated quantities block of the syntax
```stan
generated quantities {
  real slopeG2;
  slopeG2 = betaHeight + betaHxG2;
}
```
The generated quantities block computes values that do not affect the posterior distributions of the parameters; they are computed after sampling, at each kept iteration
The values are then added to the Stan object and can be seen in the summary
They can also be plotted using the bayesplot package
The contrast matrix then multiplies the coefficient vector to form the values:
\[\textbf{C} \boldsymbol{\beta}\]
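As a numeric sketch of \(\textbf{C}\boldsymbol{\beta}\) (Python with NumPy rather than Stan; the contrast rows and coefficient values are illustrative), each row of the contrast matrix picks out one derived effect as a linear combination of the coefficients:

```python
import numpy as np

# Coefficient vector (illustrative values):
# [beta0, betaHeight, betaGroup2, betaGroup3, betaHxG2, betaHxG3]
beta = np.array([-100.0, 3.0, 5.0, 10.0, 0.5, -0.25])

# Each row of C defines one derived quantity:
C = np.array([
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],  # slope for Diet Group 2: beta1 + beta4
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],  # intercept for Diet Group 2: beta0 + beta2
])

effects = C @ beta
print(effects)  # [3.5, -95.0]
```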
Contrasts in Stan
We can change our Stan code to import a contrast matrix and use it in generated quantities:
```stan
data {
  int<lower=0> N;                 // number of observations
  int<lower=0> P;                 // number of predictors (plus column for intercept)
  matrix[N, P] X;                 // model.matrix() from R
  vector[N] y;                    // outcome
  vector[P] meanBeta;             // prior mean vector for coefficients
  matrix[P, P] covBeta;           // prior covariance matrix for coefficients
  real sigmaRate;                 // prior rate parameter for residual standard deviation
  int<lower=0> nContrasts;        // number of contrasts to compute
  matrix[nContrasts, P] contrast; // contrast matrix for additional effects
}
```
Notice: the RSS depends on the sampled parameters, so we will use it to build the posterior distribution for \(R^2\)
Computing \(R^2\) in Stan
The generated quantities block can do everything we need to compute \(R^2\)
```stan
generated quantities {
  vector[nContrasts] heightSlopeG2;
  real rss;
  real totalrss;
  real R2;

  heightSlopeG2 = contrast * beta;

  { // anything inside these brackets will not appear in the summary
    vector[N] pred;
    pred = X * beta;
    rss = dot_self(y - pred);         // dot_self is a Stan function (sum of squares)
    totalrss = dot_self(y - mean(y));
  }

  R2 = 1 - rss / totalrss;
}
```
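The same per-iteration computation can be sketched outside Stan (Python with NumPy; the data and posterior draws below are simulated for illustration): for each kept draw, \(R^2\) is one minus the residual sum of squares over the total sum of squares.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data and posterior draws (illustrative; in practice X and y come
# from the data, and beta_draws from the fitted Stan object)
N, P, S = 50, 3, 1000
X = np.column_stack([np.ones(N), rng.normal(size=(N, P - 1))])
true_beta = np.array([1.0, 2.0, -1.0])
y = X @ true_beta + rng.normal(0, 0.5, size=N)
beta_draws = true_beta + rng.normal(0, 0.05, size=(S, P))

# One R^2 value per posterior draw, mirroring the generated quantities block
pred = beta_draws @ X.T                 # S x N matrix of fitted values
rss = ((y - pred) ** 2).sum(axis=1)     # dot_self(y - pred), per draw
totalrss = ((y - y.mean()) ** 2).sum()  # dot_self(y - mean(y))
R2_draws = 1 - rss / totalrss

# R2_draws is a sample from the posterior distribution of R^2
print(R2_draws.mean())
```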
See the example syntax for a demonstration
Wrapping Up
Today we further added to our Bayesian toolset:
How to use matrices to write more compact Stan syntax
How to form posterior distributions for functions of parameters
We will use both of these features in psychometric models
Up Next
We have one more lecture on linear models that will introduce
Methods for relative model comparisons
Methods for checking the absolute fit of a model
Then all things we have discussed to this point will be used in our psychometric models