Unwanted Multidimensionality


Multidimensional Measurement Models (Fall 2023): Lecture 9

Today’s Lecture

  • Types of MIRT Models
    • Compensatory
    • Partially Compensatory
  • Descriptions of MIRT Models
    • Multidimensional Discrimination Index
    • Multidimensional Difficulty Index
    • Vector Item Plots
  • How to detect multidimensionality
    • PCA/Exploratory Factor Analysis (not helpful)
    • Relative model fit (depending on specific model comparisons)
  • How to deal with multidimensionality
    • Removing misfitting items
    • Controlling for unwanted effects using auxiliary dimensions
    • Hoping dimensionality doesn’t matter (it does)

Lecture Overview

  • To describe what to do when multidimensionality is unwanted, we must first examine the types of MIRT models
    • Why: The nature of multidimensionality is important for determining what to do about it
  • When multidimensionality is unwanted, one remedy is ignoring it
    • Why: One type of multidimensionality is proportional to what a composite of unidimensional latent variables would look like
  • But, to get there, we must first dive into the models
  • Of note, the first section focuses largely on items that measure more than one dimension
    • This is where confusion often arises:
      • Most of the time, such cases are exploratory (although not always)
      • In such cases, these descriptive measures are seldom reported, but they help show whether a multidimensional model can be approximated well by a unidimensional model

Types of MIRT Models

A common classification of MIRT models is one of compensatory vs. non-compensatory models

  • Compensatory models allow for the effects of one dimension to be compensated for by the effects of another dimension
    • Example: A student who is low in math ability but high in reading ability may still be able to answer a math word problem correctly
  • Non-compensatory models do not allow for the effects of one dimension to be compensated for by the effects of another dimension
    • Example: A student who is low in math ability but high in reading ability may not be able to answer a math word problem correctly

Mathematical Distinctions of Compensatory vs. Non-Compensatory Models

Compensatory models are almost always the same form: additive within the space of the link function

For a binary item \(i\), measuring two dimensions \(\theta_1\) and \(\theta_2\), the probability of a correct response is: \[ P(Y_{pi} = 1 | \boldsymbol{\theta}_p) = \frac{\exp \left(\mu_i + \lambda_{i1} \theta_{p1} + \lambda_{i2} \theta_{p2} \right)} {1+\exp \left(\mu_i + \lambda_{i1} \theta_{p1} + \lambda_{i2} \theta_{p2} \right)} \]

What makes this model compensatory is the additive nature of the traits

Visualizing Compensatory Models

To show what a compensatory model looks like graphically, we can plot the item response function for a single item where:

  • \(\mu_i = -0.7\), \(\lambda_{i1} = 0.75\), \(\lambda_{i2} = 1.5\), and
  • \(E(\boldsymbol{\theta}) = \boldsymbol{0}\); \(\text{diag}\left(Var(\boldsymbol{\theta})\right) = \text{diag}\left( \boldsymbol{I}\right)\)

Compensatory: Same Probability of Correct Response for Different Profiles

The contour lines on the previous slide show a line of equal probability for varying values of \(\theta_1\) and \(\theta_2\)

  • This means that the probability of a correct response is the same for different profiles
    • A person with \(\theta_1=0\) and \(\theta_2=.5\) has a probability of a correct response of about 0.51 (a logit of 0.05)
    • A person with \(\theta_1=4\) and \(\theta_2=-1.5\) has the same probability of about 0.51 (again a logit of 0.05)
      • A high value of \(\theta_1\) is compensating for a low value of \(\theta_2\)
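
The two profiles above can be checked with a minimal R sketch of the compensatory item response function. The helper name `p_compensatory` is hypothetical (it is not from any package); the parameter values are those of the example item.

```r
# Hypothetical helper: compensatory two-dimensional item response function
p_compensatory <- function(theta1, theta2, mu, lambda1, lambda2) {
  # plogis() is the inverse logit: exp(x) / (1 + exp(x))
  plogis(mu + lambda1 * theta1 + lambda2 * theta2)
}

# Example item from the slides
mu <- -0.7
lambda1 <- 0.75
lambda2 <- 1.5

p_a <- p_compensatory(0, 0.5, mu, lambda1, lambda2)   # theta = (0, 0.5)
p_b <- p_compensatory(4, -1.5, mu, lambda1, lambda2)  # theta = (4, -1.5)
c(p_a, p_b)
```

Both calls return the same value (about 0.51) because both profiles produce a logit of 0.05: the sum, not the individual traits, determines the probability.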

Non-Compensatory Models

Non-compensatory models (called partially compensatory models in Reckase, 2009) are more complicated mathematically

One model, from Sympson (1979):

\[ P(Y_{pi} = 1 | \boldsymbol{\theta}_p) = c_i + (1-c_i) \prod_{d=1}^2 \frac{\exp \left(a_{id} \left(\theta_{pd} - b_{id} \right) \right)} {1+\exp \left(a_{id} \left(\theta_{pd} - b_{id} \right) \right)} \]

Notice how the product takes the place of the sum in the compensatory model

Non-Compensatory Models

  • Many non-compensatory psychometric models started with similar parameterizations
    • Later, we will see Diagnostic Classification Models (DCMs) that are non-compensatory
    • However, with such models, we can show that the non-compensatory mechanism is equivalent to having a latent variable interaction in an additive model

\[ P(Y_{pi} = 1 | \boldsymbol{\theta}_p) = c_i + (1-c_i) \frac{\exp \left(\mu_i + \lambda_{i1} \theta_{p1} + \lambda_{i2} \theta_{p2} + \lambda_{i12} \theta_{p1} \theta_{p2} \right)} {1+\exp \left(\mu_i + \lambda_{i1} \theta_{p1} + \lambda_{i2} \theta_{p2} + \lambda_{i12} \theta_{p1} \theta_{p2} \right)} \]

  • Is this also true for non-compensatory MIRT models? (Question for future research)

Visualizing Non-Compensatory Models

To show what the Sympson (1979) non-compensatory model looks like graphically, we can plot the item response function for a single item where:

  • \(b_1 = -0.5\), \(b_2 = 0.5\), \(c = 0.2\), \(a_1 = 0.75\), \(a_2 = 1.1\), and
  • \(E(\boldsymbol{\theta}) = \boldsymbol{0}\); \(\text{diag}\left(Var(\boldsymbol{\theta})\right) = \text{diag}\left( \boldsymbol{I}\right)\)

More on Non-Compensatory

The term non-compensatory is used because a person cannot compensate for a deficiency in one dimension with a strength in another dimension

  • Here is a plot of the same model, but with \(\theta_1\) fixed at -4
    • Notice the range of the Y-axis–the maximum probability is 0.2054407
    • This occurs at \(\theta_2\) = 4
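
A minimal R sketch of the Sympson (1979) model makes the ceiling easy to verify. The helper name `p_noncompensatory` is hypothetical; the parameter values are the ones listed for the example item.

```r
# Hypothetical helper: Sympson (1979) non-compensatory model, two dimensions
p_noncompensatory <- function(theta, a, b, guess) {
  # theta, a, b are length-2 vectors; guess is the lower asymptote c
  guess + (1 - guess) * prod(plogis(a * (theta - b)))
}

a <- c(0.75, 1.1)
b <- c(-0.5, 0.5)
guess <- 0.2

# theta_1 fixed at -4, theta_2 at 4, parameters paired as listed
p_listed <- p_noncompensatory(c(-4, 4), a, b, guess)

# Same evaluation with the two dimensions' a and b values swapped
p_swapped <- p_noncompensatory(c(-4, 4), rev(a), rev(b), guess)

c(listed = p_listed, swapped = p_swapped)
```

With the parameters paired as listed, this sketch gives about 0.253; the value 0.2054407 is reproduced when the two dimensions' \(a\) and \(b\) values are swapped, so the plot may have paired the parameters with the dimensions differently. Either way, the probability stays close to the lower asymptote \(c = 0.2\): a very high \(\theta_2\) cannot make up for a very low \(\theta_1\).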

Another Way: Non-Compensatory Via Latent Variable Interactions

  • Here is the same contour plot, but with a latent variable interaction parameter

Descriptive Statistics of MIRT Models

For compensatory MIRT models, we can define some descriptive statistics that will help to show how a model functions

  • To do this, we must first define the \(\boldsymbol{\theta}\) space–the space of the latent variables
    • For simplicity, we will assume that \(E(\boldsymbol{\theta}) = \boldsymbol{0}\) and \(\text{diag}\left(Var(\boldsymbol{\theta})\right) = \text{diag}\left( \boldsymbol{I}\right)\)
  • For two dimensions, the \(\boldsymbol{\theta}\) space is:

Theta Space: Contour Plot of MIRT ICC

Next, envision we have an item that measures two dimensions, \(\theta_1\) and \(\theta_2\)

  • We can then overlay the \(\boldsymbol{\theta}\) space with the equi-probability contours of the item response function
    • This is called the Item Characteristic Curve (ICC)
    • The ICC is the probability of a correct response as a function of the latent variables

Direction of Steepest Slope: Direction of Measurement

A number of researchers (e.g., Muraki and Carlson, 1995; Reckase, 2009) define the direction of measurement as the direction of steepest slope of the ICC

  • We can show this direction with a dashed line in the plot

The slope of the dashed line comes from trigonometry and considers a triangle with sides \(\theta_1\) and \(\theta_2\)

  • Here, for a one unit change from the origin in \(\theta_1\) (i.e., \(\theta_1=1\)), we need the hypotenuse of the triangle formed by the line
    • But first, we need to describe the angle of the contours and the location of the 50% probability contour

Multidimensional Discrimination Index

The vector on the previous plot is oriented in the direction of measurement and has length proportional to the slope of the item in each direction (a “multidimensional slope”)

  • We can determine the length of the vector by using the Pythagorean Theorem
    • The length of the vector is the square root of the sum of the squares of the slopes in each direction
    • This is called the Multidimensional Discrimination Index (MDISC)

\[ \text{MDISC}_i = \sqrt{\sum_{d=1}^D \lambda_{id}^2} \]
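
As a worked example, the compensatory item plotted earlier (\(\lambda_{i1} = 0.75\), \(\lambda_{i2} = 1.5\)) gives:

\[ \text{MDISC}_i = \sqrt{0.75^2 + 1.5^2} = \sqrt{0.5625 + 2.25} = \sqrt{2.8125} \approx 1.677 \]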

“Direction of Measurement”

The “Direction of Measurement” (quotes used to denote a term that may not mean what it describes) is then the angle of the vector emanating from the origin in the direction of the steepest slope, relative to one dimension (here \(\theta_1\))

In radians: \[ \text{DOM}_i = \arccos \left( \frac{\lambda_{i1}}{\text{MDISC}_i} \right) \]

In degrees:

\[ \text{DOM}_i = \arccos \left( \frac{\lambda_{i1}}{\text{MDISC}_i} \right) \left(\frac{180}{\pi}\right) \]
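
As a worked example, for the compensatory item plotted earlier (\(\lambda_{i1} = 0.75\), with a multidimensional discrimination of about 1.677):

\[ \text{DOM}_i = \arccos \left( \frac{0.75}{1.677} \right) \approx \arccos(0.447) \approx 1.107 \text{ radians} \approx 63.4^\circ \]

The angle is closer to 90 degrees than to 0, so this item points more toward the \(\theta_2\) axis, consistent with its larger loading on \(\theta_2\).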

Multidimensional Difficulty Index

Similar to MDISC (multidimensional discrimination), we can also define multidimensional difficulty:

\[ \text{MDIFF}_i = \frac{-\mu_i}{\text{MDISC}_i} \]

This is the signed distance between the origin of the \(\boldsymbol{\theta}\)-space and the point where the direction of measurement intersects the 50% probability contour; the negative sign is needed because \(\mu_i\) is an intercept, so harder items (more negative \(\mu_i\)) have larger MDIFF
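
The three descriptive statistics can be computed together in a short R sketch. The helper name `mirt_descriptives` is hypothetical, and the sketch uses the signed form \(-\mu_i/\text{MDISC}_i\) for difficulty (the convention in Reckase, 2009, for intercept parameterizations), under which harder items get positive values.

```r
# Hypothetical helper: MDISC, direction of measurement, and MDIFF
# for a compensatory item in slope-intercept form
mirt_descriptives <- function(mu, lambda) {
  mdisc <- sqrt(sum(lambda^2))                # length of the slope vector
  dom_deg <- acos(lambda / mdisc) * 180 / pi  # angle with each axis, in degrees
  mdiff <- -mu / mdisc                        # signed distance to the 50% contour
  list(MDISC = mdisc, DOM = dom_deg, MDIFF = mdiff)
}

mu <- -0.7
lambda <- c(0.75, 1.5)
item <- mirt_descriptives(mu, lambda)

# Point reached by travelling MDIFF units from the origin along the
# direction of measurement (unit vector lambda / MDISC)
point <- item$MDIFF * lambda / item$MDISC

# Probability at that point under the compensatory model
p_at_point <- plogis(mu + sum(lambda * point))
```

Here `p_at_point` equals 0.5, which confirms that travelling the MDIFF distance along the direction of measurement lands on the 50% probability contour.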

Multidimensional Difficulty Displayed

Vector Item Plots

We can use MDIFF and MDISC to plot items as vectors in two dimensions:

Later, we will see that this plot contributes to the “hope” solution of multidimensionality
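
A vector item plot can be sketched in base R. The item parameters below are hypothetical (they are not from the lecture data), and the sketch again uses the signed form \(-\mu_i/\text{MDISC}_i\) for difficulty. Each arrow starts MDIFF units from the origin along the item's direction of measurement and has length MDISC.

```r
# Hypothetical item parameters (illustrative values only)
items <- data.frame(
  mu      = c(-0.7, 0.5, 0.0),
  lambda1 = c(0.75, 1.2, 0.3),
  lambda2 = c(1.50, 0.4, 1.0)
)

mdisc <- sqrt(items$lambda1^2 + items$lambda2^2)
mdiff <- -items$mu / mdisc

# Direction cosines: unit vector along each item's direction of measurement
u1 <- items$lambda1 / mdisc
u2 <- items$lambda2 / mdisc

# Arrow start (on the 50% contour) and end (MDISC units further along)
x0 <- mdiff * u1
y0 <- mdiff * u2
x1 <- (mdiff + mdisc) * u1
y1 <- (mdiff + mdisc) * u2

plot(NA, xlim = c(-3, 3), ylim = c(-3, 3),
     xlab = expression(theta[1]), ylab = expression(theta[2]))
abline(h = 0, v = 0, lty = 3)
arrows(x0, y0, x1, y1, length = 0.1, lwd = 2)
```

Items whose arrows point in roughly the same direction measure roughly the same composite of \(\theta_1\) and \(\theta_2\).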

Additional Information on MDISC and MDIFF

Wes Bonifay’s (2020) Sage book has a nice picture of a visual interpretation of MDISC and MDIFF

How to Detect Multidimensionality

As you are experiencing in HW4, there have been a number of methods developed to determine if an assessment is multidimensional

  • Principal components-based methods
    • PCA
    • Exploratory Factor Analysis Using Matrix Decompositions
  • ML-based Exploratory Factor Analysis
    • As we’ve seen, there is no such thing–only differing constraints
  • Model-based methods (comparing lower-echelon Q-matrices)
    • Relative model fit
      • Note: no absolute model fit as all models will fit perfectly with full information model fit indices

Principal Components-Based Methods

  • PCA is a matrix decomposition method that finds the linear combination of variables that maximizes the variance
  • From matrix algebra, consider a square and symmetric matrix \(\boldsymbol{\Sigma}\) (e.g., a correlation or covariance matrix)
    • There exist a vector of eigenvalues \(\boldsymbol{\lambda}\), collected on the diagonal of \(\boldsymbol{\Lambda} = \text{diag}(\boldsymbol{\lambda})\), and a matrix of corresponding eigenvectors \(\boldsymbol{E}\) such that \(\boldsymbol{\Sigma}\) (with \(I\) variables) can be decomposed as:

\[ \boldsymbol{\Sigma}\boldsymbol{E} = \boldsymbol{E} \boldsymbol{\Lambda} \]

\[ \boldsymbol{\Sigma} = \boldsymbol{E} \boldsymbol{\Lambda} \boldsymbol{E}^T = \sum_{i=1}^I \lambda_i \boldsymbol{e}_i \boldsymbol{e}_i^T \]

  • We use the eigenvalues to help determine if a matrix is multidimensional
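
The decomposition can be checked numerically with base R's `eigen()`. The correlation matrix below is hypothetical (illustrative values only); the sketch rebuilds it from the product of the eigenvector matrix, the diagonal matrix of eigenvalues, and the transposed eigenvector matrix.

```r
# Hypothetical 3 x 3 correlation matrix (illustrative values only)
Sigma <- matrix(c(1.0, 0.5, 0.3,
                  0.5, 1.0, 0.4,
                  0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)

decomp <- eigen(Sigma, symmetric = TRUE)
E <- decomp$vectors             # columns are eigenvectors
Lambda <- diag(decomp$values)   # diagonal matrix of eigenvalues

# Rebuild Sigma from its eigenvalues and eigenvectors
Sigma_rebuilt <- E %*% Lambda %*% t(E)
max(abs(Sigma - Sigma_rebuilt))  # numerically zero

# The eigenvalues sum to the trace of Sigma (3 for a 3 x 3 correlation matrix)
sum(decomp$values)
```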

More PCA

The eigendecomposition (the factorization of a covariance or correlation matrix) finds the eigenvalues and eigenvectors through the matrix's characteristic polynomial

  • The eigenvalues are the roots of the characteristic polynomial
  • The eigenvectors are the vectors whose direction is unchanged when multiplied by the matrix (they are only rescaled, by their eigenvalue)
    • The eigenvectors are the directions of the principal components

The Components of PCA

The “components” are linear combinations of the variables that are uncorrelated and are ordered by the amount of variance they explain

  • The first component is the linear combination of variables that explains the most variance
  • The second component is the linear combination of variables that explains the second most variance, and so on

\[ \begin{align} C_{p1} &= \boldsymbol{e}_1^T \boldsymbol{Y} \\ C_{p2} &= \boldsymbol{e}_2^T \boldsymbol{Y} \\ &\vdots \\ C_{pI} &= \boldsymbol{e}_I^T \boldsymbol{Y} \end{align} \]
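
A short R sketch can show that components built this way are uncorrelated, with variances equal to the eigenvalues. The data are simulated for illustration only, and the seed is arbitrary.

```r
set.seed(2023)  # arbitrary seed so the sketch is reproducible

# Simulated data: 500 rows, 4 correlated variables (illustrative only)
common <- rnorm(500)
Y <- sapply(1:4, function(j) 0.7 * common + rnorm(500))

S <- cov(Y)
decomp <- eigen(S, symmetric = TRUE)

# Component scores: centered data times the eigenvectors
Yc <- scale(Y, center = TRUE, scale = FALSE)
C <- Yc %*% decomp$vectors

# Covariance of the components: diagonal, with eigenvalues on the diagonal
round(cov(C), 10)
decomp$values
```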

PCA vs. Latent Variable Modeling

So, PCA is the process of developing hypothetical, uncorrelated, linear combinations of the data

  • One can almost envision why PCA gets used–sum scores
    • But, the components are not latent variables
    • And, the latent variables in latent variable models don’t purport a sum
  • But, in the 1930s (and slightly before), this was the technology that was available
    • And, it is still used today
  • Factor analytic versions of PCA replace the diagonal of the covariance/correlation matrix with factor-analytic friendly terms (accounting for uniqueness)
    • And then conduct a PCA on the modified matrix

Many Issues with PCA

  • Solutions are highly unstable (sampling distributions of eigenvalues/eigenvectors are quite diffuse)
  • Not a good match to latent variable models directly
  • When data are not continuous (or plausibly continuous), Pearson correlation/covariance matrix is not appropriate
  • Missing data are an issue (assumed MCAR) as correlations pairwise delete missing data
    • Can you envision a method to fix some of these issues?

Conducting a PCA

A PCA yields eigenvalues, which get used to describe how many “factors” are in the data

  • We then use the eigenvalues to determine how many factors to extract
    • A plot of the raw eigenvalues is given by the scree plot:

library(psych)  # provides tetrachoric()

# Eigendecompositions of three association matrices for the same items
pearsonCovEigen = eigen(cov(mathData, use = "pairwise.complete.obs"))
pearsonCorEigen = eigen(cor(mathData, use = "pairwise.complete.obs"))
tetrachoricCorEigen = eigen(tetrachoric(mathData)$rho)

# One column of eigenvalues per matrix, with a matching column of component numbers
eigens = cbind(pearsonCovEigen$values, pearsonCorEigen$values, tetrachoricCorEigen$values)
number = cbind(1:nrow(eigens), 1:nrow(eigens), 1:nrow(eigens))

# Scree plot: eigenvalue against component number (legend line types match matplot defaults)
matplot(x = number, y = eigens, type="l", lwd=3)
legend("topright", legend = c("Covariance", "Correlation", "Tetrachoric"), lwd = 3, col = 1:3, lty = 1:3)

Variance Accounted For Scree Plot

Because the sum of the eigenvalues equals the trace of the matrix, these values are often normalized to sum to 1

  • Indicating amount of variance explained by each component
    • Look for the “bend” in the plot

# Normalize each set of eigenvalues so that it sums to 1 (proportion of variance)
pearsonCovEigenRel = eigen(cov(mathData, use = "pairwise.complete.obs"))
pearsonCovEigenRel$values = pearsonCovEigenRel$values/sum(pearsonCovEigenRel$values)

pearsonCorEigenRel = eigen(cor(mathData, use = "pairwise.complete.obs"))
pearsonCorEigenRel$values = pearsonCorEigenRel$values/sum(pearsonCorEigenRel$values)

tetrachoricCorEigenRel = eigen(tetrachoric(mathData)$rho)  # tetrachoric() is from psych
tetrachoricCorEigenRel$values = tetrachoricCorEigenRel$values/sum(tetrachoricCorEigenRel$values)

eigens = cbind(pearsonCovEigenRel$values, pearsonCorEigenRel$values, tetrachoricCorEigenRel$values)
number = cbind(1:nrow(eigens), 1:nrow(eigens), 1:nrow(eigens))

# Scree plot of the proportion of variance explained by each component
matplot(x = number, y = eigens, type="l", lwd=3)
legend("topright", legend = c("Covariance", "Correlation", "Tetrachoric"), lwd = 3, col = 1:3, lty = 1:3)

What Do Factors Mean?

After determining the number of factors, the next step is to determine what the factors mean

  • Often, the eigenvectors are used…but first, they are often rotated
  • Rotation is a process of rotating the axes of the factors to make them more interpretable
  • There are many methods of rotation
    • Orthogonal rotation methods (e.g., varimax, quartimax)
    • Oblique rotation methods (e.g., promax, oblimin)
             [,1]        [,2]        [,3]        [,4]
 [1,] -0.29488081  0.10153886 -0.20131875 -0.08503269
 [2,] -0.13781048  0.14609952  0.23625998 -0.22186797
 [3,] -0.27687531  0.07880159 -0.25413778  0.01619808
 [4,] -0.23901033  0.15259905 -0.14818152 -0.17800873
 [5,] -0.20514764  0.03284329 -0.02945147 -0.07305243
 [6,] -0.25312820 -0.35567205 -0.11509346 -0.17544714
 [7,] -0.26068430 -0.27164513 -0.10130783 -0.11930544
 [8,] -0.29079122 -0.25932320 -0.11091923  0.05796744
 [9,] -0.29694818 -0.25099253 -0.10306566 -0.14563199
[10,] -0.29048894 -0.07996012  0.01989836 -0.09075951
[11,] -0.08760734  0.10151296  0.36457520 -0.47133241
[12,] -0.14783080  0.24009739  0.16437918  0.31369328
[13,] -0.19443827 -0.10386534  0.32745399  0.25912577
[14,] -0.16890773 -0.28475166  0.45303066  0.18472929
[15,] -0.19220479 -0.15763035  0.29118562  0.41184417
[16,] -0.23563429  0.24287980  0.05154379  0.11779284
[17,] -0.18504507  0.32525734 -0.20754095  0.34385900
[18,] -0.24187247  0.28061878 -0.18221381  0.12954723
[19,] -0.12591548  0.31952918  0.33135165 -0.28318730
[20,] -0.17916598  0.26558927  0.15328453 -0.08888260
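
Rotation can be sketched with `varimax()` from base R's stats package. The data below are simulated for illustration only (two clusters of three variables each), and the seed is arbitrary.

```r
set.seed(2023)  # arbitrary seed so the sketch is reproducible

# Simulated data with two clusters of variables (illustrative only)
f1 <- rnorm(500)
f2 <- rnorm(500)
Y <- cbind(sapply(1:3, function(j) 0.8 * f1 + rnorm(500, sd = 0.6)),
           sapply(1:3, function(j) 0.8 * f2 + rnorm(500, sd = 0.6)))

decomp <- eigen(cor(Y), symmetric = TRUE)

# Loadings for the first two components:
# eigenvectors scaled by the square roots of their eigenvalues
L <- decomp$vectors[, 1:2] %*% diag(sqrt(decomp$values[1:2]))

rot <- varimax(L)               # orthogonal rotation
L_rot <- unclass(rot$loadings)  # rotated loadings as a plain matrix

round(L_rot, 2)
rot$rotmat                      # the rotation matrix applied to L
```

An orthogonal rotation leaves each variable's row sum of squared loadings unchanged, so the rotated solution reproduces the correlations exactly as well as the unrotated one; only the interpretation changes.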

PCA Questions:

  1. Are our data unidimensional based on this PCA?
  2. If no: how many factors do we have?
  3. What do our factors mean?

None of these questions have a clear answer in this framework

  • Additionally, the PCA-to-latent variable mapping only assumes linearity
    • Or compensatory processes…

Exploratory Factor Analysis by Model Comparisons

An alternate approach is to conduct an EFA using model comparisons

  • Here, we compare the relative model fit of lower-echelon Q-matrices with differing dimensions
anova(model1D, model2D, model3D)
             AIC    SABIC       HQ      BIC    logLik      X2 df p
model1D 26497.86 26570.90 26573.56 26697.95 -13208.93             
model2D 26382.09 26489.82 26493.74 26677.22 -13132.05 153.772 19 0
model3D 26341.44 26482.03 26487.16 26726.60 -13093.72  76.651 18 0

Are these data unidimensional?
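
The numbers in the table can be unpacked with a few lines of R. The values are copied from the output above; the only assumption is the standard definition AIC = -2 logLik + 2k, which recovers the number of estimated parameters k for each model.

```r
# Values copied from the anova() output above
fit <- data.frame(
  model  = c("model1D", "model2D", "model3D"),
  AIC    = c(26497.86, 26382.09, 26341.44),
  SABIC  = c(26570.90, 26489.82, 26482.03),
  HQ     = c(26573.56, 26493.74, 26487.16),
  BIC    = c(26697.95, 26677.22, 26726.60),
  logLik = c(-13208.93, -13132.05, -13093.72)
)

# Number of estimated parameters implied by AIC = -2 * logLik + 2 * k
k <- round((fit$AIC + 2 * fit$logLik) / 2)

# Likelihood ratio statistics, their degrees of freedom, and p-values
lrt <- 2 * diff(fit$logLik)
df_diff <- diff(k)
p_value <- pchisq(lrt, df_diff, lower.tail = FALSE)

# Model with the smallest value under each information criterion
preferred <- sapply(fit[c("AIC", "SABIC", "HQ", "BIC")],
                    function(x) fit$model[which.min(x)])
preferred
```

The implied parameter counts are 40, 59, and 77, matching the 19 and 18 degrees of freedom in the table. The likelihood ratio tests, AIC, SABIC, and HQ favor adding dimensions, while BIC is smallest for the two-dimensional model, so the answer depends on the criterion.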

Flaws in EFA by Model Comparisons

  • Identification is built on compensatory models
    • May not map onto non-compensatory models well
  • Unknown behavior if the item(s) whose loadings are fixed to zero are actually multidimensional

A Better Solution: Confirming One Dimension/Exploring Model Misfit

  • Given the problems with the just-identified models approach to EFA, we can use a better approach
  • Investigate model fit to a unidimensional model using limited information fit statistics
    • If the model fits: Unidimensionality seems plausible
    • If the model does not fit: Investigate sources of local misfit

Investigating Residual Covariances

With the residual covariances, we can investigate the sources of local misfit

  • We can look to see if items indicate that additional latent variables should be added

What to do when multidimensionality is unwanted

  • How to deal with multidimensionality
    • Remove misfitting items (or item pairs)
    • Controlling for unwanted effects using auxiliary dimensions
    • Marginalizing over unwanted dimensions
    • Hoping dimensionality doesn’t matter

Removing Misfitting Items

We can look to see if an item is involved in a lot of misfitting item pairs and can remove that item:

Here, items 23, 24, and 25 are part of the top 5 misfitting item pairs
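
Counting how often each item appears among the worst-fitting pairs can be sketched in base R. The helper name `count_misfit_items` and the residual matrix below are hypothetical (illustrative values, not the lecture data).

```r
# Hypothetical helper: count how often each item appears among the
# k item pairs with the largest absolute residuals
count_misfit_items <- function(resid, k = 5) {
  pairs <- which(upper.tri(resid), arr.ind = TRUE)  # each pair listed once
  size <- abs(resid[upper.tri(resid)])              # same order as 'pairs'
  top <- pairs[order(size, decreasing = TRUE)[seq_len(k)], , drop = FALSE]
  items <- rownames(resid)[as.vector(top)]
  sort(table(items), decreasing = TRUE)
}

# Hypothetical residual matrix for five items (illustrative values only)
item_names <- paste0("item", 1:5)
resid <- matrix(0.01, 5, 5, dimnames = list(item_names, item_names))
resid["item2", "item4"] <- resid["item4", "item2"] <- 0.20
resid["item3", "item4"] <- resid["item4", "item3"] <- -0.15
resid["item1", "item4"] <- resid["item4", "item1"] <- 0.12
diag(resid) <- 0

counts <- count_misfit_items(resid, k = 3)
counts
```

In this made-up example, `item4` is part of all three of the worst pairs, which would make it the first candidate for removal.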

New Model

Results removing item 24:

New Model Fit:

            M2  df p      RMSEA    RMSEA_5   RMSEA_95      SRMSR       TLI       CFI
stats 422.1373 152 0 0.04023179 0.03568515 0.04479584 0.04185533 0.9273073 0.9353843

Old Model Fit:

            M2  df p      RMSEA    RMSEA_5   RMSEA_95      SRMSR       TLI       CFI
stats 520.1029 170 0 0.04330841 0.03907507 0.04755851 0.04449725 0.9121947 0.9214374
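
The degrees of freedom and RMSEA values in the two tables can be related with a short R sketch. It assumes dichotomous items fit with a 2PL model (an assumption; the item type is not stated on this slide) and the usual form RMSEA = sqrt((M2 - df) / (df × n)), where n is the sample-size term (N or N - 1, depending on the software's convention). The helper names are hypothetical.

```r
# Degrees of freedom for M2 with dichotomous 2PL items (assumed item type):
# I first-order plus I * (I - 1) / 2 second-order margins, minus 2 * I parameters
m2_df <- function(n_items) n_items * (n_items + 1) / 2 - 2 * n_items

m2_df(20)  # old model: 20 items
m2_df(19)  # new model: one item removed

# Sample-size term implied by RMSEA = sqrt((M2 - df) / (df * n))
implied_n <- function(M2, df, rmsea) (M2 - df) / (df * rmsea^2)

n_old <- implied_n(520.1029, 170, 0.04330841)
n_new <- implied_n(422.1373, 152, 0.04023179)
c(old = n_old, new = n_new)
```

Both models imply the same sample-size term, as they should for the same data, so the drop in RMSEA (from about 0.043 to about 0.040) reflects the change in M2 relative to its degrees of freedom rather than a change in sample size.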

Problems With Removing Items

  • Removing misfitting items changes the meaning of the test (validity)
    • But, leaving them in makes the validity of the test questionable
  • To calculate model fit, item pairs need at least some observations on each
    • Linking designs may not permit model fit tests

Controlling for unwanted effects using auxiliary dimensions

  • Sometimes, multidimensionality may be caused by dimensions beyond ability
  • For example
    • If raters are providing data, there may be rater effects
    • Items with a common stem may need a testlet effect
  • In such cases, adding non-reported dimensions to the psychometric model will control for the unwanted effects
    • But, estimation may be difficult

Marginalizing Over Unwanted Dimensions

A more recent method is to marginalize over unwanted dimensions:

  • A two-dimensional model is estimated
  • A single score is reported (integrate over the other score)

Reference: Ip, E. H., & Chen, S. H. (2014). Using projected locally dependent unidimensional models to measure multidimensional response data. In Handbook of Item Response Theory Modeling (pp. 226-251). Routledge.

Hoping Dimensionality Doesn’t Matter

Finally, what appears most common is to “hope” multidimensionality won’t greatly impact a unidimensional model

  • A unidimensional model fit to multidimensional data can have approximately good scores* if the vector plot has items in approximately the same direction
    • *Here, a score is a composite of scores across all dimensions
  • Reckase & Stout (1995) note a proof for “essential” unidimensionality
    • In such cases single scores may be a good reflection of multiple abilities
  • It appears we can now test this hypothesis via model comparisons with latent variable interaction models

Reckase MD, Stout W (1995) Conditions under which items that assess multiple abilities will be fit by unidimensional IRT models. Paper presented at the European meeting of the Psychometric Society, Leiden, The Netherlands

Wrapping Up

Today’s lecture was a lot! Here is a big-picture summary

  • Methods for detecting multidimensionality are numerous
    • Many aren’t very stable
  • My preferred method is still confirmatory