```
<- '
model level: 1
fw =~ y1 + y2 + y3
fw ~ x1 + x2 + x3
level: 2
fb =~ y1 + y2 + y3
fb ~ w1 + w2
'
```

# Multilevel SEM

If the data is clustered, one way to handle the clustering is to use a multilevel modeling approach. In the SEM framework, this leads to multilevel SEM. The multilevel capabilities of lavaan are still limited, but you can fit a two-level SEM with random intercepts (note: only when all data is continuous).

## Multilevel SEM model syntax

To fit a two-level SEM, you must specify a model for both levels, as follows:

This model syntax contains two blocks, one for level 1, and one for level 2. Within each block, you can specify a model just like in the single-level case. To fit this model, using a toy dataset `Demo.twolevel`

that is part of the lavaan package, you need to add the `cluster=`

argument to the sem/lavaan function call:

`<- sem(model = model, data = Demo.twolevel, cluster = "cluster") fit `

The output looks similar to a multigroup SEM output, but where the two groups are now the within and the between level respectively.

`summary(fit)`

```
lavaan 0.6.17.1884 ended normally after 37 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 20
Number of observations 2500
Number of clusters [cluster] 200
Model Test User Model:
Test statistic 0.000
Degrees of freedom 10
P-value (Chi-square) 1.000
Parameter Estimates:
Standard errors Standard
Information Observed
Observed information based on Hessian
Level 1 [within]:
Latent Variables:
Estimate Std.Err z-value P(>|z|)
fw =~
y1 1.000
y2 0.771 0.035 21.883 0.000
y3 0.732 0.034 21.668 0.000
Regressions:
Estimate Std.Err z-value P(>|z|)
fw ~
x1 0.469 0.023 20.664 0.000
x2 0.375 0.022 17.145 0.000
x3 0.182 0.021 8.872 0.000
Intercepts:
Estimate Std.Err z-value P(>|z|)
.y1 0.000
.y2 0.000
.y3 0.000
.fw 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
.y1 0.983 0.047 20.974 0.000
.y2 1.068 0.040 26.941 0.000
.y3 1.012 0.037 27.341 0.000
.fw 0.591 0.043 13.701 0.000
Level 2 [cluster]:
Latent Variables:
Estimate Std.Err z-value P(>|z|)
fb =~
y1 1.000
y2 0.720 0.053 13.712 0.000
y3 0.588 0.048 12.255 0.000
Regressions:
Estimate Std.Err z-value P(>|z|)
fb ~
w1 0.190 0.078 2.423 0.015
w2 0.124 0.076 1.633 0.102
Intercepts:
Estimate Std.Err z-value P(>|z|)
.y1 0.021 0.074 0.285 0.776
.y2 -0.019 0.060 -0.310 0.756
.y3 -0.044 0.054 -0.816 0.414
.fb 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
.y1 0.059 0.047 1.246 0.213
.y2 0.120 0.031 3.799 0.000
.y3 0.149 0.028 5.315 0.000
.fb 0.883 0.117 7.528 0.000
```

After fitting the model, you can inspect the intra-class correlations:

`lavInspect(fit, "icc")`

```
y1 y2 y3 x1 x2 x3
0.334 0.266 0.233 0.000 0.000 0.000
```

The see the unrestricted (h1) within and between means and covariances, you can use

`lavInspect(fit, "h1")`

```
$within
$within$cov
y1 y2 y3 x1 x2 x3
y1 1.964
y2 0.760 1.651
y3 0.723 0.542 1.538
x1 0.490 0.400 0.369 0.982
x2 0.389 0.298 0.283 0.001 1.011
x3 0.167 0.124 0.125 -0.006 0.008 1.045
$within$mean
y1 y2 y3 x1 x2 x3
0.001 -0.002 -0.001 -0.007 -0.003 0.020
$cluster
$cluster$cov
y1 y2 y3 w1 w2
y1 0.985
y2 0.667 0.600
y3 0.545 0.392 0.468
w1 0.146 0.136 0.052 0.870
w2 0.077 0.050 0.124 -0.128 0.931
$cluster$mean
y1 y2 y3 w1 w2
0.017 -0.018 -0.044 0.052 -0.091
```

## Important notes

note that in

`level: 1`

the colon follows the`level`

keyword; if you type`level 1:`

, you will get an erroryou must specify a model for each level; the following syntax is not allowed and will produce an error:

`<- ' model level: 1 fw =~ y1 + y2 + y3 fw ~ x1 + x2 + x3 level: 2 '`

if you do not have a model in mind for level 2, you can specify a saturated level by adding all variances and covariances of the endogenous variables (here: y1, y2 and y3):

`<- ' model level: 1 fw =~ y1 + y2 + y3 fw ~ x1 + x2 + x3 level: 2 y1 ~~ y1 + y2 + y3 y2 ~~ y2 + y3 y3 ~~ y3 '`

## Convergence issues and solutions

By default, the current version of lavaan (0.6) uses a quasi-Newton procedure to maximize the loglikelihood of the data given the model (just like in the single-level case). For most model and data combinations, this will work fine (and fast). However, every now and then, you may experience convergence issues.

Non-convergence is typically a sign that something is not quite right with either your model, or your data. Typical settings are: a small number of clusters, in combination with (almost) no variance of an endogenous variable at the between level.

However, if you believe nothing is wrong, you may want to try another optimization procedure. The current version of lavaan allows for using the Expectation Maximization (EM) algorithm as an alternative. To switch to the EM algorithm, you can use:

```
<- sem(model = model, data = Demo.twolevel, cluster = "cluster",
fit verbose = TRUE, optim.method = "em")
```

As the EM algorithm is not accelerated yet, this may take a long time. It is not unusual that more than 10000 iterations are needed to reach a solution. To control when the EM algorithm stops, you can set the stopping criteria as follows:

```
<- sem(model = model, data = Demo.twolevel, cluster = "cluster",
fit verbose = TRUE, optim.method = "em", em.iter.max = 20000,
em.fx.tol = 1e-08, em.dx.tol = 1e-04)
```

The `em.fx.tol`

argument is used to monitor the change in loglikelihood between the current step and the previous step. If this change is smaller than `em.fx.tol`

, the algorithm stops. The `em.dx.tol`

argument is used to monitor the (unscaled) gradient. When a solution is reached, all elements of the gradient should be near zero. When the largest gradient element is smaller than `em.dx.tol`

, the algorithm stops.

A word of caution: the EM algorithm can always be forced to ‘converge’ (perhaps after changing the stopping criteria), but that does not mean you have a model/dataset combination that deserves to converge.