Although I am using the terms classes and clusters synonymously, the general approach of LCA differs from that of the other methods previously discussed
LCA is a model-based method for clustering (or classification)
The other methods listed on the previous slide do not explicitly state a statistical model
By being model-based, we are making very explicit assumptions about our data
\[f(Y_{pi}) = \left(\pi_i \right)^{Y_{pi}} \left(1-\pi_i\right)^{(1-Y_{pi})}\]
The likelihood function for \(Y\) looks similar:
If \(Y=1\), the likelihood is: \[f(Y=1) = \left(0.87 \right)^{1} \left(1-0.87\right)^{(1-1)} =0.87\]
If \(Y=0\), the likelihood is: \[f(Y=0) = \left(0.87 \right)^{0} \left(1-0.87\right)^{(1-0)} =0.13\]
This example shows you how the likelihood function of a statistical distribution gives you the likelihood of an event occurring
In the case of discrete-outcome variables, the likelihood of an event is synonymous with the probability of the event occurring
If the responses are independent, the joint probability of two responses is the product of the item probabilities:
\[P(Y_1=1,Y_2=1) = \pi_1 \pi_2 = 0.87 \times 0.57 = 0.4959\]
More generally, for \(I\) independent binary items:
\[P(Y_1=y_1, Y_2=y_2,\ldots,Y_I=y_I) = \prod_{i=1}^{I} \pi_i^{Y_i}\left(1-\pi_i\right)^{\left(1-Y_i\right)}\]
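As a small illustration, here is a minimal Python sketch of this product rule (the function name is purely illustrative; 0.87 and 0.57 are the values used above):

import numpy as np

def joint_probability(y, pi):
    # prod_i pi_i^y_i * (1 - pi_i)^(1 - y_i) for independent binary items
    y, pi = np.asarray(y), np.asarray(pi)
    return np.prod(pi**y * (1 - pi)**(1 - y))

# The two-item example from above: P(Y1 = 1, Y2 = 1) with pi = (0.87, 0.57)
print(joint_probability([1, 1], [0.87, 0.57]))  # 0.4959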
In LCA, the marginal distribution of the response vector is a finite mixture over \(C\) latent classes:
\[f(\boldsymbol{Y}) = \sum_{c=1}^C \eta_c f(\boldsymbol{Y}|c)\]
A latent class model for the response vector of person \(p\) on \(I\) variables (\(i =1,\ldots,I\)) with \(C\) classes (\(c=1,\ldots,C\)):
\[f(\boldsymbol{Y}_p) = \displaystyle {\sum_{c=1}^{C}\eta_c} \prod_{i=1}^{I} \pi_{ic}^{Y_{pi}}\left(1-\pi_{ic}\right)^{1-Y_{pi}}\]
Where:
\(\eta_c\) is the probability of membership in class \(c\) (the class proportions, which sum to one across classes)
\(\pi_{ic}\) is the probability of a response of 1 on item \(i\) for a member of class \(c\)
LCA Example #1
TITLE: LCA of Macready and Dayton's data (1977).
Two classes.
DATA: FILE IS mddata.dat;
VARIABLE: NAMES ARE u1-u4;
CLASSES = c(2);
CATEGORICAL = u1-u4;
ANALYSIS: TYPE = MIXTURE;
STARTS = 100 100;
OUTPUT: TECH1 TECH10;
PLOT: TYPE=PLOT3;
SERIES IS u1(1) u2(2) u3(3) u4(4);
SAVEDATA: FORMAT IS f10.5;
FILE IS examinee_ests.dat;
SAVE = CPROBABILITIES;
FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES
BASED ON THE ESTIMATED MODEL
Latent
Classes
1 83.29149 0.58656
2 58.70851 0.41344
\(\eta_c\) are proportions in far right column
RESULTS IN PROBABILITY SCALE
Latent Class 1
U1 Category 2 0.753 0.060
U2 Category 2 0.780 0.069
U3 Category 2 0.432 0.058
U4 Category 2 0.708 0.063
Latent Class 2
U1 Category 2 0.209 0.066
U2 Category 2 0.068 0.056
U3 Category 2 0.018 0.037
U4 Category 2 0.052 0.057
\(\pi_{ic}\) are the proportions in the first numeric column, followed by their asymptotic standard errors
\[f(\boldsymbol{Y}_p) = \displaystyle {\sum_{c=1}^{C}\eta_c} \prod_{i=1}^{I} \pi_{ic}^{Y_{pi}} \left(1-\pi_{ic}\right)^{1-Y_{pi}}\]
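For example, plugging the (rounded) estimates above into this formula, the model-implied probability of giving category 2 on all four items is roughly:
\[f(1,1,1,1) \approx 0.587(0.753)(0.780)(0.432)(0.708) + 0.413(0.209)(0.068)(0.018)(0.052) \approx 0.105\]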
Class Probabilities:
Class Probability
1 0.587
2 0.413
Item Parameters
class: 1
item prob SE(prob)
1 0.753 0.051
2 0.780 0.051
3 0.432 0.056
4 0.708 0.054
class: 2
item prob SE(prob)
1 0.209 0.060
2 0.068 0.048
3 0.018 0.029
4 0.052 0.044
RESPONSE PATTERN FREQUENCIES AND CHI-SQUARE CONTRIBUTIONS
Response Frequency Standard Chi-square Contribution
Pattern Observed Estimated Residual Pearson Loglikelihood Deleted
1 41.00 41.04 0.01 0.00 -0.08
2 13.00 12.91 0.03 0.00 0.18
3 6.00 5.62 0.16 0.03 0.79
4 7.00 8.92 0.66 0.41 -3.39
5 1.00 1.30 0.27 0.07 -0.53
6 3.00 1.93 0.77 0.59 2.63
7 2.00 2.08 0.05 0.00 -0.15
8 7.00 6.19 0.33 0.10 1.71
9 4.00 4.04 0.02 0.00 -0.07
10 6.00 6.13 0.05 0.00 -0.26
11 5.00 6.61 0.64 0.39 -2.79
12 23.00 19.74 0.79 0.54 7.04
13 4.00 1.42 2.18 4.70 8.29
14 1.00 4.22 1.59 2.46 -2.88
15 4.00 4.90 0.41 0.16 -1.62
16 15.00 14.95 0.01 0.00 0.09
The likelihood ratio Chi-square is a variant of the Pearson Chi-squared test, but still uses the observed and expected frequencies for each cell
The formula for this test is: \[G = 2 \sum_r O_r \ln\left(\frac{O_r}{E_r}\right)\]
The degrees of freedom, however, are the same as for the Pearson Chi-square test
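A minimal Python sketch of both statistics, assuming observed and expected are vectors of observed and model-estimated frequencies for each response pattern (all names are illustrative):

import numpy as np

def pearson_x2(observed, expected):
    # X^2 = sum_r (O_r - E_r)^2 / E_r
    observed, expected = np.asarray(observed, float), np.asarray(expected, float)
    return np.sum((observed - expected)**2 / expected)

def likelihood_ratio_g2(observed, expected):
    # G = 2 * sum_r O_r * ln(O_r / E_r); patterns with O_r = 0 contribute 0
    observed, expected = np.asarray(observed, float), np.asarray(expected, float)
    nonzero = observed > 0
    return 2 * np.sum(observed[nonzero] * np.log(observed[nonzero] / expected[nonzero]))

Applied to the 16 observed and estimated pattern frequencies above, these functions reproduce (up to rounding) the Pearson and likelihood ratio values reported in the output below.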
Chi-Square Test of Model Fit for the Binary
and Ordered Categorical (Ordinal) Outcomes
Pearson Chi-Square
Value 9.459
Degrees of Freedom 6
P-Value 0.1494
Likelihood Ratio Chi-Square
Value 8.966
Degrees of Freedom 6
P-Value 0.1755
\[L(\boldsymbol{Y}) =\prod_{p=1}^N \left[\displaystyle {\sum_{c=1}^{C}\eta_c} \prod_{i=1}^{I} \pi_{ic}^{Y_{pi}} \left(1-\pi_{ic}\right)^{1-Y_{pi}} \right]\]
\[\log L(\boldsymbol{Y}) =\log \left(\prod_{p=1}^N \left[\displaystyle {\sum_{c=1}^{C}\eta_c} \prod_{i=1}^{I} \pi_{ic}^{Y_{pi}} \left(1-\pi_{ic}\right)^{1-Y_{pi}} \right]\right)\]
\[\log L(\boldsymbol{Y}) = \sum_{p=1}^N \log \left( \displaystyle {\sum_{c=1}^{C}\eta_c} \prod_{i=1}^{I} \pi_{ic}^{Y_{pi}} \left(1-\pi_{ic}\right)^{1-Y_{pi}}\right)\]
Here, the log is typically taken base \(e\) (the natural log)
The log likelihood is a function of the observed responses for each person and the model parameters
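A minimal sketch of this computation in Python (the names Y, eta, and pi and their shapes are assumptions for illustration, not part of any package):

import numpy as np

def lca_log_likelihood(Y, eta, pi):
    # Y: (N, I) matrix of 0/1 responses
    # eta: (C,) class proportions; pi: (I, C) item-response probabilities
    # log f(Y_p | c) = sum_i [Y_pi * log(pi_ic) + (1 - Y_pi) * log(1 - pi_ic)]
    log_f_given_c = Y @ np.log(pi) + (1 - Y) @ np.log(1 - pi)        # (N, C)
    # log f(Y_p) = log sum_c eta_c * f(Y_p | c), accumulated on the log scale
    log_f = np.logaddexp.reduce(np.log(eta) + log_f_given_c, axis=1)  # (N,)
    return np.sum(log_f)                                              # sum over persons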
\[AIC = 2q - 2 \log L\]
\[BIC = q \log (N) - 2 \log L\]
Where \(q\) is the number of free (estimated) parameters and \(N\) is the sample size
TESTS OF MODEL FIT
Loglikelihood
H0 Value -331.764
Information Criteria
Number of Free Parameters 9
Akaike (AIC) 681.527
Bayesian (BIC) 708.130
Sample-Size Adjusted BIC 679.653
(n* = (n + 2) / 24)
Entropy 0.754
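As a quick check, plugging the output values into the formulas above (here \(q = 9\) free parameters, \(\log L = -331.764\), and \(N = 142\), the sum of the class counts) reproduces the reported values up to rounding:
\[AIC = 2(9) - 2(-331.764) \approx 681.53\]
\[BIC = 9 \log(142) - 2(-331.764) \approx 708.13\]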
The model-implied (expected) mean of item \(i\) is: \[\hat{\bar{Y}}_i = \hat{E}(Y_i) = \sum_{c=1}^C \hat{\eta}_c \times \hat{\pi}_{ic}\]
Across all items, you can then form an aggregate measure of model fit by comparing the observed mean of the item to that found under the model, such as the root mean squared error (note: this is not RMSEA from CFA/IFA): \[RMSE = \sqrt{\frac{\sum_{i=1}^I(\hat{\bar{Y}}_i - \bar{Y}_i)^2}{I}}\]
Often, there is not much difference between the observed and model-predicted item means (depending on the model, this univariate fit may even be perfect, as it is below); a small numeric sketch of this check follows the output
UNIVARIATE MODEL FIT INFORMATION
Estimated Probabilities
Variable H1 H0 Standard Residual
U1
Category 1 0.472 0.472 0.000
Category 2 0.528 0.528 0.000
U2
Category 1 0.514 0.514 0.000
Category 2 0.486 0.486 0.000
U3
Category 1 0.739 0.739 0.000
Category 2 0.261 0.261 0.000
U4
Category 1 0.563 0.563 0.000
Category 2 0.437 0.437 0.000
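A minimal Python sketch of this check, using the estimated class proportions and item probabilities from the output earlier (the observed means here are the H1 category 2 proportions above; all variable names are illustrative):

import numpy as np

eta = np.array([0.587, 0.413])          # estimated class proportions
pi = np.array([[0.753, 0.209],          # P(category 2 | class), one row per item
               [0.780, 0.068],
               [0.432, 0.018],
               [0.708, 0.052]])

predicted_means = pi @ eta              # sum_c eta_c * pi_ic for each item
observed_means = np.array([0.528, 0.486, 0.261, 0.437])
rmse = np.sqrt(np.mean((predicted_means - observed_means)**2))
print(predicted_means)                  # approx. [0.528, 0.486, 0.261, 0.437]
print(rmse)                             # essentially zero here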
Similarly, the model-implied probability that two items \(a\) and \(b\) are both answered 1 (category 2) is:
\[\hat{P}(Y_a = 1, Y_b=1) = \sum_{c=1}^C \hat{\eta}_c \times \hat{\pi}_{ac} \times \hat{\pi}_{bc}\]
BIVARIATE MODEL FIT INFORMATION
Estimated Probabilities
Variable Variable H1 H0 Standard Residual
U1 U2
Category 1 Category 1 0.352 0.337 0.391
Category 1 Category 2 0.120 0.135 -0.540
Category 2 Category 1 0.162 0.177 -0.483
Category 2 Category 2 0.366 0.351 0.387
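For example, the model-implied probability that U1 and U2 are both in category 2 (the last row above) is:
\[\hat{P}(U1 = 2, U2 = 2) = 0.587(0.753)(0.780) + 0.413(0.209)(0.068) \approx 0.345 + 0.006 = 0.351\]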
A measure of classification uncertainty is the entropy of the estimated posterior class probabilities:
\[EN(\boldsymbol{\alpha}) = - \sum_{p=1}^N \sum_{c=1}^C \hat{\alpha}_{pc} \log \hat{\alpha}_{pc}\]
Where \(\hat{\alpha}_{pc}\) is the estimated posterior probability that person \(p\) belongs to class \(c\)
The entropy equation above is bounded on \([0,\infty)\), with higher values indicating a larger amount of uncertainty in classification
Mplus reports the relative entropy of a model, which is a rescaled version of entropy:
\[E = 1 - \frac{EN(\boldsymbol{\alpha})}{N \log C}\]
The relative entropy is defined on \([0,1]\), with values near one indicating high certainty in classification and values near zero indicating low certainty
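A minimal Python sketch, assuming alpha is an \(N \times C\) matrix of estimated posterior class-membership probabilities (the function and variable names are illustrative):

import numpy as np

def relative_entropy(alpha, eps=1e-12):
    # alpha: (N, C) posterior class-membership probabilities
    alpha = np.clip(np.asarray(alpha, float), eps, 1.0)    # guard against log(0)
    N, C = alpha.shape
    EN = -np.sum(alpha * np.log(alpha))                     # classification entropy EN(alpha)
    return 1 - EN / (N * np.log(C))                         # relative entropy E, as defined above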
TESTS OF MODEL FIT
Loglikelihood
H0 Value -331.764
Information Criteria
Number of Free Parameters 9
Akaike (AIC) 681.527
Bayesian (BIC) 708.130
Sample-Size Adjusted BIC 679.653
(n* = (n + 2) / 24)
Entropy 0.754
The finite mixture model is a general framework for modeling data; it can be expressed as:
\[f(\boldsymbol{Y}) = \sum_{c=1}^C \eta_c f(\boldsymbol{Y}|c)\]
When a continuous latent variable \(\theta_p\) is also included within class (as in a mixture IRT model), the model for person \(p\) becomes:
\[f(\boldsymbol{Y}_p \mid \theta_p) = \sum_{c=1}^C \eta_c f(\boldsymbol{Y}_p|c, \theta_p)\]
Where:
\[f(\boldsymbol{Y}_p \mid c, \theta_p) = \prod_{i=1}^I \pi_{ic}^{Y_{pi}} \left( 1-\pi_{ic} \right)^{1-Y_{pi}}\]
And:
\[\pi_{ic} = P \left( Y_{pi} = 1 \mid \theta_p, c \right) = \frac{\exp \left( a_{ic} \left( \theta_p - b_{ic}\right) \right)}{1+\exp \left( a_{ic} \left( \theta_p - b_{ic}\right) \right)}\]
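A small Python sketch of this within-class item response function (the parameter values are hypothetical, purely to illustrate the formula):

import numpy as np

def irf_2pl(theta, a, b):
    # P(Y = 1 | theta, c) = exp(a(theta - b)) / (1 + exp(a(theta - b)))
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical item in some class c: a_ic = 1.2, b_ic = -0.5
print(irf_2pl(theta=0.0, a=1.2, b=-0.5))  # approx. 0.65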
Multidimensional Measurement Models (Fall 2023): Lecture 10