Example data: 635 older adults (age 80-100) self-reporting on 7 items assessing the Instrumental Activities of Daily Living (IADL) as follows:
Two versions of a response format were available:
Binary -> 0 = “needs help”, 1 = “does not need help”
Categorical -> 0 = “can’t do it”, 1=”big problems”, 2=”some problems”, 3=”no problems”
Higher scores indicate greater function. We will look at each response format in turn.
if (require(lavaan) == FALSE){
install.packages("lavaan")
}
library(lavaan)
if (require(mirt) == FALSE){
install.packages("mirt")
}
library(mirt)
if (require(ggplot2) == FALSE){
install.packages("ggplot2")
}
library(ggplot2)
The data are in a text file named adl.dat
orignally used
in Mplus (so no column names were included at the top of the file). The
file contains more items than we will use, so we select only items above
from the whole file.
#read in data file (Mplus file format so having to label columns)
adlData = read.table(file = "adl.dat", header = FALSE, na.strings = ".", col.names = c("case", paste0("dpa", 1:14), paste0("dia", 1:7), paste0("cpa", 1:14), paste0("cia", 1:7)))
#select Situations items and PersonID variables
iadlDataInit = adlData[c(paste0("dia", 1:7))]
#remove cases with all items missing
removeCases = which(apply(X = iadlDataInit, MARGIN = 1, FUN = function (x){ length(which(is.na(x)))}) == 7)
iadlData = iadlDataInit[-removeCases,]
We will introduce the mirt
package as a method for
estimating IRT models. Overall, the package is very good, but typically
is used for scaling purposes (measurement rather than use of latent
variables in additional model equations). We use the package to
demonstrate estimating IRT models using marginal maximum likelihood. If
you wish to use the latent trait estimates in secondary analyses (which
you would otherwise use SEM for simultaneously), there are additional
steps to take to ensure the error associated with each score is carried
over to the subsequent analysese
When all the items of the model are the same type, the
mirt
syntax is very short. The mirt()
function
is used to provide estimates, with options model=1
for all
items measuring the same trait and itemtype="2PL"
for the
two-parameter logistic model ("Rasch"
is used for the
1PL
shorthand). The "Rasch"
designation
estimates a model where the loadings are all set to one and the
factor/latent trait variance is estimated) – which is an equivalent
model to the one estimated below but we seek to keep the latent trait
standardized. We will estimate both simultaneously here:
mirt1PLsyntax = "
IADL = 1-7
CONSTRAIN = (1-7, a1)
COV = 1
"
model1PLmirt = mirt(data = iadlData, model = mirt1PLsyntax)
Iteration: 1, Log-Lik: -1878.487, Max-Change: 1.89501
Iteration: 2, Log-Lik: -1577.569, Max-Change: 1.18305
Iteration: 3, Log-Lik: -1505.605, Max-Change: 0.58836
Iteration: 4, Log-Lik: -1483.057, Max-Change: 0.34308
Iteration: 5, Log-Lik: -1474.066, Max-Change: 0.23870
Iteration: 6, Log-Lik: -1469.990, Max-Change: 0.16868
Iteration: 7, Log-Lik: -1466.436, Max-Change: 0.06185
Iteration: 8, Log-Lik: -1465.995, Max-Change: 0.04630
Iteration: 9, Log-Lik: -1465.692, Max-Change: 0.03439
Iteration: 10, Log-Lik: -1465.184, Max-Change: 0.01246
Iteration: 11, Log-Lik: -1465.079, Max-Change: 0.01223
Iteration: 12, Log-Lik: -1464.992, Max-Change: 0.01176
Iteration: 13, Log-Lik: -1464.662, Max-Change: 0.00863
Iteration: 14, Log-Lik: -1464.646, Max-Change: 0.00732
Iteration: 15, Log-Lik: -1464.632, Max-Change: 0.00642
Iteration: 16, Log-Lik: -1464.579, Max-Change: 0.00288
Iteration: 17, Log-Lik: -1464.576, Max-Change: 0.00252
Iteration: 18, Log-Lik: -1464.574, Max-Change: 0.00216
Iteration: 19, Log-Lik: -1464.569, Max-Change: 0.00417
Iteration: 20, Log-Lik: -1464.568, Max-Change: 0.00277
Iteration: 21, Log-Lik: -1464.567, Max-Change: 0.00138
Iteration: 22, Log-Lik: -1464.566, Max-Change: 0.00109
Iteration: 23, Log-Lik: -1464.566, Max-Change: 0.00161
Iteration: 24, Log-Lik: -1464.565, Max-Change: 0.00118
Iteration: 25, Log-Lik: -1464.565, Max-Change: 0.00168
Iteration: 26, Log-Lik: -1464.564, Max-Change: 0.00142
Iteration: 27, Log-Lik: -1464.564, Max-Change: 0.00099
Iteration: 28, Log-Lik: -1464.564, Max-Change: 0.00104
Iteration: 29, Log-Lik: -1464.563, Max-Change: 0.00102
Iteration: 30, Log-Lik: -1464.563, Max-Change: 0.00086
Iteration: 31, Log-Lik: -1464.563, Max-Change: 0.00082
Iteration: 32, Log-Lik: -1464.563, Max-Change: 0.00074
Iteration: 33, Log-Lik: -1464.563, Max-Change: 0.00070
Iteration: 34, Log-Lik: -1464.563, Max-Change: 0.00076
Iteration: 35, Log-Lik: -1464.563, Max-Change: 0.00065
Iteration: 36, Log-Lik: -1464.563, Max-Change: 0.00074
Iteration: 37, Log-Lik: -1464.563, Max-Change: 0.00051
Iteration: 38, Log-Lik: -1464.563, Max-Change: 0.00065
Iteration: 39, Log-Lik: -1464.563, Max-Change: 0.00048
Iteration: 40, Log-Lik: -1464.563, Max-Change: 0.00049
Iteration: 41, Log-Lik: -1464.563, Max-Change: 0.00039
Iteration: 42, Log-Lik: -1464.563, Max-Change: 0.00031
Iteration: 43, Log-Lik: -1464.563, Max-Change: 0.00010
Iteration: 44, Log-Lik: -1464.563, Max-Change: 0.00009
model2PLmirt = mirt(data = iadlData, model = 1, itemtype = "2PL")
Iteration: 1, Log-Lik: -1878.487, Max-Change: 1.36006
Iteration: 2, Log-Lik: -1572.925, Max-Change: 1.11015
Iteration: 3, Log-Lik: -1499.005, Max-Change: 0.91115
Iteration: 4, Log-Lik: -1474.055, Max-Change: 0.72471
Iteration: 5, Log-Lik: -1464.621, Max-Change: 0.54473
Iteration: 6, Log-Lik: -1460.408, Max-Change: 0.40623
Iteration: 7, Log-Lik: -1456.776, Max-Change: 0.16294
Iteration: 8, Log-Lik: -1456.384, Max-Change: 0.13300
Iteration: 9, Log-Lik: -1456.118, Max-Change: 0.10161
Iteration: 10, Log-Lik: -1455.651, Max-Change: 0.03754
Iteration: 11, Log-Lik: -1455.561, Max-Change: 0.02954
Iteration: 12, Log-Lik: -1455.486, Max-Change: 0.02638
Iteration: 13, Log-Lik: -1455.201, Max-Change: 0.01568
Iteration: 14, Log-Lik: -1455.180, Max-Change: 0.01315
Iteration: 15, Log-Lik: -1455.164, Max-Change: 0.01114
Iteration: 16, Log-Lik: -1455.117, Max-Change: 0.00739
Iteration: 17, Log-Lik: -1455.110, Max-Change: 0.00634
Iteration: 18, Log-Lik: -1455.105, Max-Change: 0.00517
Iteration: 19, Log-Lik: -1455.092, Max-Change: 0.00484
Iteration: 20, Log-Lik: -1455.089, Max-Change: 0.00409
Iteration: 21, Log-Lik: -1455.087, Max-Change: 0.00363
Iteration: 22, Log-Lik: -1455.084, Max-Change: 0.00367
Iteration: 23, Log-Lik: -1455.082, Max-Change: 0.00288
Iteration: 24, Log-Lik: -1455.081, Max-Change: 0.00267
Iteration: 25, Log-Lik: -1455.078, Max-Change: 0.00262
Iteration: 26, Log-Lik: -1455.078, Max-Change: 0.00195
Iteration: 27, Log-Lik: -1455.077, Max-Change: 0.00180
Iteration: 28, Log-Lik: -1455.075, Max-Change: 0.00088
Iteration: 29, Log-Lik: -1455.075, Max-Change: 0.00063
Iteration: 30, Log-Lik: -1455.075, Max-Change: 0.00065
Iteration: 31, Log-Lik: -1455.074, Max-Change: 0.00024
Iteration: 32, Log-Lik: -1455.074, Max-Change: 0.00014
Iteration: 33, Log-Lik: -1455.074, Max-Change: 0.00013
Iteration: 34, Log-Lik: -1455.074, Max-Change: 0.00073
Iteration: 35, Log-Lik: -1455.074, Max-Change: 0.00016
Iteration: 36, Log-Lik: -1455.074, Max-Change: 0.00048
Iteration: 37, Log-Lik: -1455.074, Max-Change: 0.00010
Unlike lavaan
, mirt
does not provide a nice
formatting of parameters with the summary statment. Rather, we get parts
of estimates through various pieces.
The model log-likelihood and summary information is given by the
show()
function:
show(model1PLmirt)
Call:
mirt(data = iadlData, model = mirt1PLsyntax)
Full-information item factor analysis with 1 factor(s).
Converged within 1e-04 tolerance after 44 EM iterations.
mirt version: 1.44.0
M-step optimizer: BFGS
EM acceleration: Ramsay
Number of rectangular quadrature: 61
Latent density type: Gaussian
Log-likelihood = -1464.563
Estimated parameters: 14
AIC = 2945.125
BIC = 2980.754; SABIC = 2955.355
show(model2PLmirt)
Call:
mirt(data = iadlData, model = 1, itemtype = "2PL")
Full-information item factor analysis with 1 factor(s).
Converged within 1e-04 tolerance after 37 EM iterations.
mirt version: 1.44.0
M-step optimizer: BFGS
EM acceleration: Ramsay
Number of rectangular quadrature: 61
Latent density type: Gaussian
Log-likelihood = -1455.074
Estimated parameters: 14
AIC = 2938.149
BIC = 3000.499; SABIC = 2956.051
Also note that the model log-likelihood information does not include a test of the model against an alternative, as does a typical CFA analysis in comparing the model fit of your model to one where all parameters were estimated. This is because the saturated model in IRT is different (for models where all items are binary, it is Multivariate Bernoulli) in that the statistics of interest come in the form of the proportion of people with a given response pattern.
To see estimates, use the coef()
function. Here are the
estimates for the 1PL model:
coef1PL = coef(model1PLmirt)
coef1PL
$dia1
a1 d g u
par 4.39 1.637 0 1
$dia2
a1 d g u
par 4.39 4.651 0 1
$dia3
a1 d g u
par 4.39 3.509 0 1
$dia4
a1 d g u
par 4.39 1.908 0 1
$dia5
a1 d g u
par 4.39 1.881 0 1
$dia6
a1 d g u
par 4.39 2.988 0 1
$dia7
a1 d g u
par 4.39 7.467 0 1
$GroupPars
MEAN_1 COV_11
par 0 1
The coef()
function returns an R list of the parameters
for each item along with the structural model parameters (the
$GroupPars
element), which shows the mean and variance of
the latent variable. For each item, there are at least four parameters
listed:
a1
: the factor loading/item discrimination/slope: the
change in the log-odds of \(P(Y_{si}=1)\) per one unit change in \(\theta_s\) (approximate change in the 3/4PL
models)d
: the item intercept the expected value of the
log-odds of \(P(Y_{si}=1)\) for \(\theta_s = 0\) (approximate expectation in
the 3/4PL models)g
: the lower asymptote or guessing parameter (for 3PL
this will be non-zero)u
: the upper asymptote (for 4PL
or, in
mirt, a type of a 3PL, this will be non-zero)Note how the item discrimination (the a1
term) is equal
for all items – this is done by convention in the 1PL model.
Putting the parameters into equation form, we have a slope/intercept form of the IRT model:
\[P(Y_{si} = 1 | \theta_s) = g_i + (u_i-g_i)\frac{\exp\left(d_i + a1_i \theta_s \right)}{1+\exp\left(d_i + a1_i \theta_s \right)}\]
Another commonly used parameterization of the IRT model is called discrimination/difficulty, given by:
\[P(Y_{si} = 1 | \theta_s) = g_i + (u_i-g_i)\frac{\exp\left(a1_i \left( \theta_s - b_i \right) \right)}{1+\exp\left(a1_i \left( \theta_s - b_i \right) \right)}\]
The two parameterizations are equivalent and one can be found by re-arranging terms of the other. To get the item difficulty from the slope/intercept parameterization:
\[b_i = -\frac{d_i}{a1_i}\]
For our results, we can use the lapply
function to add
the item difficulties:
getDifficulty = function(itemPar){
parnames = colnames(itemPar)
if ("a1" %in% parnames){
itemPar = c(itemPar, -1*itemPar[2]/itemPar[1])
names(itemPar) = c(parnames, "b")
return(itemPar)
} else {
return(itemPar)
}
}
lapply(X = coef1PL, FUN = getDifficulty)
$dia1
a1 d g u b
4.3895702 1.6372853 0.0000000 1.0000000 -0.3729944
$dia2
a1 d g u b
4.389570 4.651017 0.000000 1.000000 -1.059561
$dia3
a1 d g u b
4.3895702 3.5086672 0.0000000 1.0000000 -0.7993191
$dia4
a1 d g u b
4.3895702 1.9084551 0.0000000 1.0000000 -0.4347704
$dia5
a1 d g u b
4.389570 1.881067 0.000000 1.000000 -0.428531
$dia6
a1 d g u b
4.3895702 2.9876548 0.0000000 1.0000000 -0.6806258
$dia7
a1 d g u b
4.389570 7.466592 0.000000 1.000000 -1.700985
$GroupPars
MEAN_1 COV_11
par 0 1
itemPar = coef1PL[[1]]
For the 2PL, we can use a similar method (here condensed to display the item difficulties):
coef2PL = lapply(X = coef(model2PLmirt), FUN = getDifficulty)
coef2PL
$dia1
a1 d g u b
4.373935 1.606529 0.000000 1.000000 -0.367296
$dia2
a1 d g u b
5.058002 5.227547 0.000000 1.000000 -1.033520
$dia3
a1 d g u b
4.3647262 3.4551388 0.0000000 1.0000000 -0.7916049
$dia4
a1 d g u b
7.197112 2.944432 0.000000 1.000000 -0.409113
$dia5
a1 d g u b
4.273968 1.807076 0.000000 1.000000 -0.422810
$dia6
a1 d g u b
3.4634209 2.4201714 0.0000000 1.0000000 -0.6987806
$dia7
a1 d g u b
3.303801 5.952140 0.000000 1.000000 -1.801604
$GroupPars
MEAN_1 COV_11
par 0 1
As the 1PL is nested within the 2PL, we can use a likelihood ratio test to see which model is preferred. The LRT tests the null hypothesis that all item discriminations are equal against an alternative that not all are equal:
anova(model1PLmirt, model2PLmirt)
Here, the test statistic was \(\chi_6 = 18.977\) with a p-value of .004. Therefore, we reject the null hypothesis of equal slopes and conclude the 2PL fits better than the 1PL model.
The LRT, however, assumes both models have a sufficient level of
absolute fit to the data. One way to tell is the use of the
M2()
function, which provides model fit to the 2-way tables
(think item-pair covariances). Because our data has some missing
responses, we have to use the impute=10
option, imputing 10
values per missing response. Here is the value for the 1PL:
M2(obj = model1PLmirt, impute = 10)
M2(obj = model2PLmirt, impute=10)
The statistics given from the M2 function are similar to those used
in CFA–these show approximate model fit indices such as RMSEA, SRMR,
TLI, and CFI. From these, it appears the model fits approximately (CFI
and TLI near 1 but relatively poor RMSEA). To find misfitting
“residuals” we need complete data and the function M2()
and
the imputeMissing()
functions are not working. So, here is
an example with complete data and the 2PL:
model2PLmirtB = mirt(data = iadlData[complete.cases(iadlData),], model = 1, itemtype = "2PL")
Iteration: 1, Log-Lik: -1817.248, Max-Change: 1.46767
Iteration: 2, Log-Lik: -1517.948, Max-Change: 1.03243
Iteration: 3, Log-Lik: -1447.274, Max-Change: 1.00582
Iteration: 4, Log-Lik: -1424.072, Max-Change: 0.74506
Iteration: 5, Log-Lik: -1415.121, Max-Change: 0.55113
Iteration: 6, Log-Lik: -1411.156, Max-Change: 0.39934
Iteration: 7, Log-Lik: -1407.784, Max-Change: 0.17155
Iteration: 8, Log-Lik: -1407.439, Max-Change: 0.13233
Iteration: 9, Log-Lik: -1407.206, Max-Change: 0.10162
Iteration: 10, Log-Lik: -1406.798, Max-Change: 0.03483
Iteration: 11, Log-Lik: -1406.725, Max-Change: 0.02867
Iteration: 12, Log-Lik: -1406.663, Max-Change: 0.02278
Iteration: 13, Log-Lik: -1406.425, Max-Change: 0.01418
Iteration: 14, Log-Lik: -1406.410, Max-Change: 0.01160
Iteration: 15, Log-Lik: -1406.397, Max-Change: 0.00953
Iteration: 16, Log-Lik: -1406.352, Max-Change: 0.00510
Iteration: 17, Log-Lik: -1406.348, Max-Change: 0.00443
Iteration: 18, Log-Lik: -1406.344, Max-Change: 0.00389
Iteration: 19, Log-Lik: -1406.331, Max-Change: 0.00247
Iteration: 20, Log-Lik: -1406.330, Max-Change: 0.00214
Iteration: 21, Log-Lik: -1406.330, Max-Change: 0.00209
Iteration: 22, Log-Lik: -1406.327, Max-Change: 0.00080
Iteration: 23, Log-Lik: -1406.327, Max-Change: 0.00070
Iteration: 24, Log-Lik: -1406.327, Max-Change: 0.00071
Iteration: 25, Log-Lik: -1406.326, Max-Change: 0.00055
Iteration: 26, Log-Lik: -1406.326, Max-Change: 0.00021
Iteration: 27, Log-Lik: -1406.326, Max-Change: 0.00013
Iteration: 28, Log-Lik: -1406.326, Max-Change: 0.00075
Iteration: 29, Log-Lik: -1406.326, Max-Change: 0.00013
Iteration: 30, Log-Lik: -1406.326, Max-Change: 0.00060
Iteration: 31, Log-Lik: -1406.326, Max-Change: 0.00053
Iteration: 32, Log-Lik: -1406.326, Max-Change: 0.00019
Iteration: 33, Log-Lik: -1406.326, Max-Change: 0.00053
Iteration: 34, Log-Lik: -1406.326, Max-Change: 0.00015
Iteration: 35, Log-Lik: -1406.326, Max-Change: 0.00037
Iteration: 36, Log-Lik: -1406.326, Max-Change: 0.00017
Iteration: 37, Log-Lik: -1406.326, Max-Change: 0.00015
Iteration: 38, Log-Lik: -1406.326, Max-Change: 0.00041
Iteration: 39, Log-Lik: -1406.326, Max-Change: 0.00038
Iteration: 40, Log-Lik: -1406.326, Max-Change: 0.00015
Iteration: 41, Log-Lik: -1406.326, Max-Change: 0.00038
Iteration: 42, Log-Lik: -1406.326, Max-Change: 0.00008
M2(obj = model2PLmirtB)
M2(obj = model2PLmirtB, residmat = TRUE)
dia1 dia2 dia3 dia4 dia5 dia6 dia7
dia1 NA NA NA NA NA NA NA
dia2 0.02663977 NA NA NA NA NA NA
dia3 0.06240477 0.06313870 NA NA NA NA NA
dia4 -0.02758525 -0.02668132 -0.038895907 NA NA NA NA
dia5 -0.02215675 -0.01360095 -0.080263677 0.04226633 NA NA NA
dia6 -0.03183329 -0.04414976 -0.029552955 0.02706127 0.03628025 NA NA
dia7 -0.02704504 -0.01200917 0.007826515 0.01542917 -0.00691414 0.02537108 NA
Here we see the biggest descripancy of residual covariances is that for dia5 with dia3 at -.08.
Finally, we can see plots of our model (all shown for the 2PL model). First, the item characteristic curves
plot(model2PLmirt, item=1, type = "trace", theta_lim = c(-3,3))
Next we can see the test information plot:
plot(model2PLmirt, type = "info", theta_lim = c(-3,3))
We can see that our test information peaks around a theta of -.5, meaning scores near -.5 will be the most reliable.
Finally, we can use the fscores()
function to get the
estimated trait scores. Note, there are several types of scores
available. The standard used for score reporting is
method = "EAP"
, which are scores that use the expected
value of the posterior distribution of the score. For doing secondary
analyses, multiple “plausible” scores should be used with the option
plausible.draws = #
where # is the number of scores to
draw. We plot the density of the test scores following estimating them
with fscores()
:
theta = fscores(object = model2PLmirt, method = "EAP")
hist(theta)
The plot shows a number of people with scores at the maximum value – but very few around the most reliable portion of the test, -.5
Here, we will plot the scores along with the standard error of the scores to show the relationship:
theta2 = fscores(object = model2PLmirt, method = "EAP", full.scores.SE = TRUE)
plot(x = theta2[complete.cases(iadlData),1], y = theta2[complete.cases(iadlData),2], xlab = expression(theta), ylab = "SE", main = "Complete Data Theta vs. SE(Theta)")
For the cases with complete data, this is the best the SE gets. When plotting all the data, you can see the impact of missing data on SE: fewer observations means a higher SE for the trait.
plot(x = theta2[,1], y = theta2[,2], xlab = expression(theta), ylab = "SE", main = "All Data Theta vs. SE(Theta)")
We can also plot the item difficulty values to get a sense of the the scale is telling us about the trait:
itemDif = unlist(lapply(X = coef2PL, FUN = function(x) return(x[5])))
plot(x = 1:7, y = itemDif[1:7], type = "l", xlab = "Item", ylab = "Item Difficulty (b)")
Here are the items, again:
Note that no items are available to measure above-average abilities well! The item difficulty for most items covers values of Theta between -1.0 to -0.5.
lavaan
: Limited Information Onlylavaan
only provides limited information estimates of
IRT/IFA models, which are parallel (but not necessarily equal to) Mplus’
WLSMV methods. This is a limitation of lavaan
, so if ML
versions of estimates are needed, Mplus will have to be used.
Alternatively, the mirt
package in R estimates IRT models
(and many other types), but has rather limited capabilities for SEM with
the models used and does not include model functions for continous
observed variables and provides very few SEM fit statistics.
We will compare the one-parameter vs. two-parameter models for Binary Responses using WLSMV Probit model. Beginning with the two-parameter model.
The two-parameter model is called the two-parameter logisitic model
if the logit link function is used (2PL), otherwise it is called the
two-parameter normal ogive (2PNO). The syntax is largely identical to
that use for CFA, however, item intercepts (item ~ 1
in
CFA) are now replaced by item thresholds (item | t#
where
#
is the number of the threshold, in order, from 1 through
the number of categories on the item minus one).
The normal ogive model provides a model for a categorical variable’s (\(Y\) “underlying” continuous analog).
model2PSyntax = "
# loadings/discrimination parameters:
IADL =~ dia1 + dia2 + dia3 + dia4 + dia5 + dia6 + dia7
# threshholds use the | operator and start at value 1 after t:
dia1 | t1; dia2 | t1; dia3 | t1; dia4 | t1; dia5 | t1; dia6 | t1; dia7 | t1;
# factor mean:
IADL ~ 0;
# factor variance:
IADL ~~ 1*IADL
dia1 ~~ dia2
"
model2PEstimates = sem(model = model2PSyntax, data = iadlData, ordered = c("dia1", "dia2", "dia3", "dia4", "dia5", "dia6", "dia7"),
mimic = "Mplus", estimator = "WLSMV", std.lv = TRUE, parameterization = "theta")
summary(model2PEstimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan 0.6-19 ended normally after 84 iterations
Estimator DWLS
Optimization method NLMINB
Number of model parameters 15
Used Total
Number of observations 609 635
Model Test User Model:
Standard Scaled
Test Statistic 32.171 47.620
Degrees of freedom 13 13
P-value (Chi-square) 0.002 0.000
Scaling correction factor 0.709
Shift parameter 2.253
simple second-order correction (WLSMV)
Model Test Baseline Model:
Test statistic 18222.541 12045.449
Degrees of freedom 21 21
P-value 0.000 0.000
Scaling correction factor 1.514
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.999 0.997
Tucker-Lewis Index (TLI) 0.998 0.995
Robust Comparative Fit Index (CFI) 0.912
Robust Tucker-Lewis Index (TLI) 0.858
Root Mean Square Error of Approximation:
RMSEA 0.049 0.066
90 Percent confidence interval - lower 0.028 0.047
90 Percent confidence interval - upper 0.071 0.087
P-value H_0: RMSEA <= 0.050 0.487 0.084
P-value H_0: RMSEA >= 0.080 0.008 0.142
Robust RMSEA 0.277
90 Percent confidence interval - lower 0.135
90 Percent confidence interval - upper 0.417
P-value H_0: Robust RMSEA <= 0.050 0.011
P-value H_0: Robust RMSEA >= 0.080 0.982
Standardized Root Mean Square Residual:
SRMR 0.039 0.039
Parameter Estimates:
Parameterization Theta
Standard errors Robust.sem
Information Expected
Information saturated (h1) model Unstructured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
IADL =~
dia1 2.525 0.299 8.441 0.000 2.525 0.930
dia2 2.626 0.397 6.616 0.000 2.626 0.935
dia3 2.936 0.452 6.498 0.000 2.936 0.947
dia4 3.813 0.640 5.953 0.000 3.813 0.967
dia5 2.473 0.298 8.300 0.000 2.473 0.927
dia6 1.965 0.222 8.848 0.000 1.965 0.891
dia7 1.523 0.283 5.381 0.000 1.523 0.836
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.dia1 ~~
.dia2 0.354 0.192 1.841 0.066 0.354 0.354
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
IADL 0.000 0.000 0.000
Thresholds:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
dia1|t1 -0.953 0.170 -5.616 0.000 -0.953 -0.351
dia2|t1 -2.803 0.388 -7.222 0.000 -2.803 -0.997
dia3|t1 -2.326 0.353 -6.600 0.000 -2.326 -0.750
dia4|t1 -1.627 0.325 -5.005 0.000 -1.627 -0.413
dia5|t1 -1.101 0.179 -6.147 0.000 -1.101 -0.413
dia6|t1 -1.445 0.171 -8.473 0.000 -1.445 -0.655
dia7|t1 -2.952 0.380 -7.759 0.000 -2.952 -1.621
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
IADL 1.000 1.000 1.000
.dia1 1.000 1.000 0.136
.dia2 1.000 1.000 0.127
.dia3 1.000 1.000 0.104
.dia4 1.000 1.000 0.064
.dia5 1.000 1.000 0.141
.dia6 1.000 1.000 0.206
.dia7 1.000 1.000 0.301
Scales y*:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
dia1 0.368 0.368 1.000
dia2 0.356 0.356 1.000
dia3 0.322 0.322 1.000
dia4 0.254 0.254 1.000
dia5 0.375 0.375 1.000
dia6 0.453 0.453 1.000
dia7 0.549 0.549 1.000
R-Square:
Estimate
dia1 0.864
dia2 0.873
dia3 0.896
dia4 0.936
dia5 0.859
dia6 0.794
dia7 0.699
We can also inspect the residual polychoric correlation matrix to investigate model misfit. Note these are the raw residuals, not a normalized or standardized version. Below that, the modification indices are displayed.
resid(model2PEstimates)
$type
[1] "raw"
$cov
dia1 dia2 dia3 dia4 dia5 dia6 dia7
dia1 0.000
dia2 0.000 0.000
dia3 0.040 0.035 0.000
dia4 -0.015 -0.033 -0.052 0.000
dia5 -0.023 -0.020 -0.106 0.027 0.000
dia6 -0.036 -0.039 -0.045 0.023 0.026 0.000
dia7 -0.104 0.004 0.005 0.030 -0.034 0.063 0.000
$mean
dia1 dia2 dia3 dia4 dia5 dia6 dia7
0 0 0 0 0 0 0
$th
dia1|t1 dia2|t1 dia3|t1 dia4|t1 dia5|t1 dia6|t1 dia7|t1
0 0 0 0 0 0 0
modificationindices(model2PEstimates, sort. = TRUE)
From the modifiation indices, we see several trends we saw with the
M2()
residuals in R: that items 5 and 3 need some
additional help (as do items 1 and 3 and items 5 and 4).
lavaanCatItemPlot = function(lavObject, varname, sds = 3){
output = inspect(object = lavObject, what = "est")
if (!varname %in% rownames(output$lambda)) stop(paste(varname, "not found in lavaan object"))
if (dim(output$lambda)[2]>1) stop("plots only given for one factor models")
itemloading = output$lambda[which(rownames(output$lambda) == varname),1]
itemthresholds = output$tau[grep(pattern = varname, x = rownames(output$tau))]
factorname = colnames(output$lambda)
factormean = output$alpha[which(rownames(output$alpha) == factorname)]
factorvar = output$psi[which(rownames(output$psi) == factorname)]
factormin = factormean - 3*sqrt(factorvar)
factormax = factormean + 3*sqrt(factorvar)
factorX = seq(factormin, factormax, .01)
itemloc = which(lavObject@Data@ov$name == varname)
itemlevels = unlist(strsplit(x = lavObject@Data@ov$lnam[itemloc], split = "\\|"))
if (length(itemthresholds)>1){
plotdata = NULL
plotdata2 = NULL
itemY = NULL
itemY2 = NULL
itemX = NULL
itemText = NULL
for (level in 1:length(itemthresholds)){
itemY = pnorm(q = -1*itemthresholds[level] + itemloading*factorX)
itemY2 = cbind(itemY2, pnorm(q = -1*itemthresholds[level] + itemloading*factorX))
itemText = paste0("P(", varname, " > ", itemlevels[level], ")")
itemText2 = paste0("P(", varname, " = ", itemlevels[level], ")")
plotdata = rbind(plotdata, data.frame(factor = factorX, prob = itemY, plot = itemText))
if (level == 1){
plotdata2 = data.frame(factor = factorX, plot = itemText2, prob = matrix(1, nrow = dim(itemY2)[1], ncol=1) - itemY2[,level])
} else if (level == length(itemthresholds)){
plotdata2 = rbind(plotdata2, data.frame(factor = factorX, plot = itemText2, prob = itemY2[,level-1] - itemY2[,level]))
plotdata2 = rbind(plotdata2, data.frame(factor = factorX, plot = paste0("P(", varname, " = ", itemlevels[level+1], ")"), prob = itemY2[,level]))
} else {
plotdata2 = rbind(plotdata2, data.frame(factor = factorX, plot = itemText2, prob = itemY2[,level-1] - itemY2[,level]))
}
}
names(plotdata) = c(factorname , "Probability", "Cumulative")
ggplot(data = plotdata, aes_string(x = factorname, y = "Probability", colour = "Cumulative")) + geom_line(size = 2)
names(plotdata2) = c(factorname, "Response", "Probability")
ggplot(data = plotdata2, aes_string(x = factorname, y = "Probability", colour = "Response")) + geom_line(size = 2)
} else {
itemY = pnorm(q = -1*itemthresholds[1] + itemloading*factorX)
itemText2 = paste0("P(", varname, " = ", itemlevels[2], ")")
plotdata = data.frame(factor = factorX, prob = itemY, plot = itemText2)
names(plotdata) = c(factorname , "Probability", "Response")
ggplot(data = plotdata, aes_string(x = factorname, y = "Probability", colour = "Response")) + geom_line(size = 2)
}
}
lavaanCatItemPlot(lavObject = model2PEstimates, varname = "dia2", sds = 3)
convertTheta2IRT = function(lavObject){
if (!lavObject@Options$parameterization == "theta") stop("your model is not estimated with parameterization='theta'")
output = inspect(object = lavObject, what = "est")
if (ncol(output$lambda)>1) stop("IRT conversion is only valid for one dimensional factor models. Your model has more than one dimension.")
a = output$lambda
b = output$tau/output$lambda
return(list(a = a, b=b))
}
convertTheta2IRT(lavObject = model2PEstimates)
$a
IADL
dia1 2.525
dia2 2.626
dia3 2.936
dia4 3.813
dia5 2.473
dia6 1.965
dia7 1.523
$b
thrshl
dia1|t1 -0.377
dia2|t1 -1.067
dia3|t1 -0.792
dia4|t1 -0.427
dia5|t1 -0.445
dia6|t1 -0.735
dia7|t1 -1.939
To estimate the 1PNO model in lavaan
, we use the
following syntax. The summary, residuals, and modification indices are
displayed subsequently.
model1PSyntax = "
# loadings/discrimination parameters:
IADL =~ loading*dia1 + loading*dia2 + loading*dia3 + loading*dia4 + loading*dia5 + loading*dia6 + loading*dia7
# threshholds use the | operator and start at value 1 after t:
dia1 | t1; dia2 | t1; dia3 | t1; dia4 | t1; dia5 | t1; dia6 | t1; dia7 | t1;
# factor mean:
IADL ~ 0;
# factor variance:
IADL ~~ 1*IADL
"
model1PEstimates = sem(model = model1PSyntax, data = iadlData, ordered = c("dia1", "dia2", "dia3", "dia4", "dia5", "dia6", "dia7"),
mimic = "Mplus", estimator = "WLSMV", std.lv = TRUE, parameterization = "theta")
summary(model1PEstimates, fit.measures = TRUE, rsquare = TRUE, standardized = TRUE)
lavaan 0.6-19 ended normally after 25 iterations
Estimator DWLS
Optimization method NLMINB
Number of model parameters 14
Number of equality constraints 6
Used Total
Number of observations 609 635
Model Test User Model:
Standard Scaled
Test Statistic 62.072 64.108
Degrees of freedom 20 20
P-value (Chi-square) 0.000 0.000
Scaling correction factor 1.037
Shift parameter 4.227
simple second-order correction (WLSMV)
Model Test Baseline Model:
Test statistic 18222.541 12045.449
Degrees of freedom 21 21
P-value 0.000 0.000
Scaling correction factor 1.514
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.998 0.996
Tucker-Lewis Index (TLI) 0.998 0.996
Robust Comparative Fit Index (CFI) 0.902
Robust Tucker-Lewis Index (TLI) 0.897
Root Mean Square Error of Approximation:
RMSEA 0.059 0.060
90 Percent confidence interval - lower 0.043 0.044
90 Percent confidence interval - upper 0.076 0.077
P-value H_0: RMSEA <= 0.050 0.175 0.141
P-value H_0: RMSEA >= 0.080 0.019 0.026
Robust RMSEA 0.235
90 Percent confidence interval - lower 0.089
90 Percent confidence interval - upper 0.364
P-value H_0: Robust RMSEA <= 0.050 0.030
P-value H_0: Robust RMSEA >= 0.080 0.956
Standardized Root Mean Square Residual:
SRMR 0.065 0.065
Parameter Estimates:
Parameterization Theta
Standard errors Robust.sem
Information Expected
Information saturated (h1) model Unstructured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
IADL =~
dia1 (ldng) 2.630 0.166 15.817 0.000 2.630 0.935
dia2 (ldng) 2.630 0.166 15.817 0.000 2.630 0.935
dia3 (ldng) 2.630 0.166 15.817 0.000 2.630 0.935
dia4 (ldng) 2.630 0.166 15.817 0.000 2.630 0.935
dia5 (ldng) 2.630 0.166 15.817 0.000 2.630 0.935
dia6 (ldng) 2.630 0.166 15.817 0.000 2.630 0.935
dia7 (ldng) 2.630 0.166 15.817 0.000 2.630 0.935
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
IADL 0.000 0.000 0.000
Thresholds:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
dia1|t1 -0.987 0.158 -6.248 0.000 -0.987 -0.351
dia2|t1 -2.807 0.190 -14.746 0.000 -2.807 -0.997
dia3|t1 -2.111 0.177 -11.893 0.000 -2.111 -0.750
dia4|t1 -1.161 0.162 -7.187 0.000 -1.161 -0.413
dia5|t1 -1.161 0.160 -7.266 0.000 -1.161 -0.413
dia6|t1 -1.844 0.172 -10.728 0.000 -1.844 -0.655
dia7|t1 -4.560 0.291 -15.684 0.000 -4.560 -1.621
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
IADL 1.000 1.000 1.000
.dia1 1.000 1.000 0.126
.dia2 1.000 1.000 0.126
.dia3 1.000 1.000 0.126
.dia4 1.000 1.000 0.126
.dia5 1.000 1.000 0.126
.dia6 1.000 1.000 0.126
.dia7 1.000 1.000 0.126
Scales y*:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
dia1 0.355 0.355 1.000
dia2 0.355 0.355 1.000
dia3 0.355 0.355 1.000
dia4 0.355 0.355 1.000
dia5 0.355 0.355 1.000
dia6 0.355 0.355 1.000
dia7 0.355 0.355 1.000
R-Square:
Estimate
dia1 0.874
dia2 0.874
dia3 0.874
dia4 0.874
dia5 0.874
dia6 0.874
dia7 0.874
resid(object = model1PEstimates)
$type
[1] "raw"
$cov
dia1 dia2 dia3 dia4 dia5 dia6 dia7
dia1 0.000
dia2 0.042 0.000
dia3 0.047 0.045 0.000
dia4 0.011 -0.002 -0.010 0.000
dia5 -0.035 -0.028 -0.102 0.050 0.000
dia6 -0.081 -0.080 -0.075 0.011 -0.022 0.000
dia7 -0.201 -0.088 -0.078 -0.036 -0.133 -0.066 0.000
$mean
dia1 dia2 dia3 dia4 dia5 dia6 dia7
0 0 0 0 0 0 0
$th
dia1|t1 dia2|t1 dia3|t1 dia4|t1 dia5|t1 dia6|t1 dia7|t1
0 0 0 0 0 0 0
modificationindices(model1PEstimates, sort. = TRUE)
To compare models, we use the anova()
function, which
uses the correct test for us. Here we see the 2PL is preferred to the
1PL, again.
anova(model1PEstimates, model2PEstimates)
Scaled Chi-Squared Difference Test (method = “satorra.2000”)
lavaan->lavTestLRT():
lavaan NOTE: The “Chisq” column contains standard test statistics, not the robust test that should be reported per model. A robust difference test is a function of two standard (not robust) statistics.
Df AIC BIC Chisq Chisq diff Df diff Pr(>Chisq)
model2PEstimates 13 32.171
model1PEstimates 20 62.072 22.238 7 0.002311 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1