Part 1: Investigating the mean structure, and CLPM

Day 4: Trait, time lags, and measurement error

In this first part of the computer lab you are going to analyze the same data that were used in the Hamaker, Kuiper, and Grasman (2015) paper and were presented in the lecture. The goals of these exercises are to get some experience with fitting (random intercept) cross-lagged panel models, interpreting their results, and comparing them. Here, we focus on the relationship between Parental Responsiveness and Adolescents’ Depressive Symptomatology (note, in the lecture the focus was on a different concept, namely Parental Psychological Control).

The data (means, standard deviations, and the correlation matrix) are included in the file Soenens.dat on SURFdrive. The number of observations is 396. The data command for these data should be:

DATA:       
  TYPE = MEANS STDEVIATIONS CORRELATION;   
  FILE = Soenens.dat;
  NOBSERVATIONS = 396;

The variable command should be:

VARIABLE:   
  NAMES = PsCon1-PsCon3 Res1-Res3 BeCon1-BeCon3 Dep1-Dep3;
  USEVARIABLES = Res1 Dep1 Res2 Dep2 Res3 Dep3;

Exercise 1: Mean structure

Specify a model in which you investigate whether the group means can be constrained across time. Tip: Allow all variables to covary with each other (i.e. specify a model without constraints on the covariance structure) such that we are estimating means, and not intercepts.

Exercise 1A

Does this model fit?

The model specification is given in S20_Day4_Exercise1A.inp on SURFdrive. The model does not fit well, according to the \(\chi^{2}\) and the RMSEA; CFI and TLI indicate a good and barely acceptable model fit, respectively. See the slides of day 1 of this summer school if you need a refresher on the recommended cutoff values for these fit indices (although such cutoff recommendations remain a hotly debated topic among statisticians).

  • \(\chi^{2}\) (4) = 19.620, \(p < .001\)
  • RMSEA = 0.099
  • CFI = 0.986
  • TLI = 0.946

Exercise 1B

What can you do to improve the model fit?

We can look at the modification indices in the output. Here, we specified MOD(4) in the OUTPUT command, which indicates all parameters that can be added/freed that will lead to a decrease in chi-square of at least 4 points (which is a significant improvement combined with 1 df).

Based on the Mplus output, it appears that freeing the mean of Dep1 leads to the largest change in chi-square (M.I. = 10.53).

Exercise 1C

Run the adjusted model and discuss the results.

Syntax for the adjusted model is in S20_Day4_Exercise1C.inp. Looking at the fit indices of the adjusted model, the chi-square for this model is still significant, but all other measures indicate the model fit is okay to excellent. Moreover, the model is a significant improvement in comparison to the previous model: \(\Delta\chi^{2} = 19.62 - 8.71 = 10.91\), with \(\Delta\)df = 1 and, \(p = .001\).

You can calculate the p-value of the \(\Delta \chi^{2}\) using the pchisq()-function in R (with the lower.tail argument set to FALSE), or an online tool

Exercise 2: The CLPM

Exercise 2A

Specify the cross-lagged panel model (CLPM) in Mplus. Regress the observed variables directly on each other over time, as in the graph above. Try to think of where each parameter you estimate should go in the graph. Compare your model specification with the input file S20_Day4_Exercise2A.inp in SURFdrive.

The cross-lagged panel model.

Exercise 2B

Specify the model as in the graph below (the CLPM with centered latent variables), and again include the (significant) parameter estimates in this graph. Tip: Do not forget to constrain the measurement error variances of the observed variables to 0. Compare your model specification with the input file S20_Day4_Exercise2B.inp in SURFdrive.

The cross-lagged panel model with centered latent variables.

Exercise 2C

The two models should lead to the exact same model fit. Estimate the model and discuss the model fit. When comparing the parameter estimates, what is the difference between the model in exercise 2A and the model in exercise 2B?

The CLPM with centered latent variables is, just like the CLPM from exercise 2A, a traditional CLPM, but specified in such a way that the mean structure is first separated from the regression part of the model. This implies we model the means, rather than intercepts, which makes it easier to impose the constraint of identical means over time. So, the fit indices and most parameter estimates are equivalent, except for… (see next question)

Exercise 2D

Which parameter estimates differ? How can they be related?

The difference in parameter estimates is in the means/intercepts. For the first model we estimate the means at wave 1, and intercepts thereafter. For the second model we obtain mean estimates at each wave (although they are referred to as intercepts in the Mplus output).

Again, the reason for this is that in the first model, the means at wave 2 and 3 are partly predicted by the means from previous waves (through the lagged relationships). Therefore, the constants that are estimated should be interpreted as intercepts. In the second model, the mean part (triangles) and the regression part (lagged parameters) are separated.

Exercise 2E

Constrain the means in the second version of the CLPM. Please do not constrain the mean of Dep1, however, as this constraint proved untenable in exercise 1C. Discuss the model fit.

The input for these models are in S20_Day4_Exercise2E.inp. The chi-square has increased in comparison with the previous model: The reason is that this is a restricted model (a special case of the previous model). We can test whether the increase in misfit is significant using the chi-square difference test: \(\Delta\chi^{2} = 69.59 - 62.98 = 6.61\) with \(\Delta\)df = 7 – 4 = 3, \(p = .0854\). Hence, the constraints are tenable. We can also look at information criteria such as the AIC or BIC. The model with the lowest AIC or BIC is preferred.

Exercise 2F

Continue with the model from the previous question and constrain all the lagged parameters between wave 1 and 2 to be identical to the lagged parameters between wave 2 and 3. Discuss the model fit, and compare this model to the previous one to see whether the constraint holds.

The input for these models are in S20_Day3_Exercise2F.inp. The chi-square has again increased in comparison with the previous model: The reason is that this is a restricted model (a special case of the previous model). We can test whether the increase in misfit is significant: \(\Delta\chi^{2} 81.52 - 69.59 = 11.93\), \(\Delta\)(df) = 11 - 7 = 4, \(p = .0178\).

If we stick to an alpha of .05, the constraints must be rejected (as the increase in misfit is significant). Note that we are doing several model comparisons here, so one could also argue that we should use a smaller alpha to guard against an inflated overall type I error. Again, an alternative would be to use information criteria for model comparisons.

References

Hamaker, Ellen L., Rebecca M. Kuiper, and Raoul P. P. P. Grasman. 2015. “A Critique of the Cross-Lagged Panel Model.” Psychological Methods 20 (1): 102–16. https://doi.org/10.1037/a0038889.