By and large, structural equation models are used to model the covariance matrix of the observed variables in a dataset. But in some applications, it is useful to bring in the means of the observed variables too. One way to do this is to explicitly refer to intercepts in the lavaan syntax. This can be done by including ‘intercept formulas’ in the model syntax. An intercept formula has the following form:
variable ~ 1
The left part of the expression contains the name of the observed or latent variable. The right part contains the number 1, representing the intercept. For example, in the three-factor H&S CFA model, we can add the intercepts of the observed variables as follows:
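A minimal sketch of such a model syntax is given below. The factor names (visual, textual, speed) and the loading structure follow the standard H&S example shipped with lavaan and are assumed here; the object names HS.model.int and fit.int are purely illustrative.

```r
library(lavaan)

# Three-factor H&S CFA model with explicit intercept formulas.
# Assumption: factor names and indicators follow lavaan's standard H&S example.
HS.model.int <- '
  # latent variables
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9

  # intercepts of the observed variables
  x1 ~ 1
  x2 ~ 1
  x3 ~ 1
  x4 ~ 1
  x5 ~ 1
  x6 ~ 1
  x7 ~ 1
  x8 ~ 1
  x9 ~ 1
'

# Mentioning intercepts in the syntax brings the means into the model.
fit.int <- cfa(HS.model.int, data = HolzingerSwineford1939)
summary(fit.int)
```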
However, it is more convenient to omit the intercept formulas in the model syntax (unless you want to fix their values), and to add the argument meanstructure = TRUE in the fitting function. For example, we can refit the three-factor H&S CFA model as follows:
fit <- cfa(HS.model, data = HolzingerSwineford1939, meanstructure = TRUE)
summary(fit, remove.unused = FALSE)
As you can see in the output, the model includes intercept parameters for both the observed and latent variables. By default, the cfa() and sem() functions fix the latent variable intercepts (which in this case correspond to the latent means) to zero. Otherwise, the model would not be estimable. Note the use of remove.unused = FALSE in the summary() call. This ensures that the fixed-to-zero latent intercepts are shown in the output. Since lavaan 0.6-17, these ‘unused’ parameters (parameters that are not mentioned in the model syntax, but are added automatically to the parameter table with a default value) are no longer shown in the output by default.
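One way to look at the intercept parameters directly, rather than scanning the full summary, is to filter the parameter table on the intercept operator. The sketch below assumes the standard H&S model definition from the lavaan examples; the object name pt is illustrative.

```r
library(lavaan)

# Assumption: HS.model is the standard three-factor H&S model.
HS.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'

fit <- cfa(HS.model, data = HolzingerSwineford1939, meanstructure = TRUE)

# Intercepts are stored with operator "~1" in the parameter table.
# Rows with free == 0 are fixed; here these are the latent intercepts.
pt <- parTable(fit)
pt[pt$op == "~1", c("lhs", "op", "free", "est")]
```

The nine observed-variable intercepts appear as free parameters, while the three latent intercepts have free equal to 0 and an estimate of zero.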
Note that the chi-square statistic and the number of degrees of freedom are the same as in the original model (without a mean structure). The reason is that we brought in some new data (a mean value for each of the 9 observed variables), but we also added 9 additional parameters to the model (an intercept for each of the 9 observed variables). The end result is an identical fit. In practice, the only reason to add intercept formulas to the model syntax is that some constraints must be specified on them. For example, suppose that we wish to fix the intercepts of the variables x1, x2, x3 and x4 to, say, 0.5. We would write the model syntax as follows:
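A minimal sketch of this syntax is shown below, using lavaan's usual pre-multiplication mechanism to fix a parameter to a value. The factor structure is again assumed to be the standard H&S example, and the object names HS.model.fixed, pt.fixed and fit.fixed are illustrative.

```r
library(lavaan)

# Assumption: the standard three-factor H&S structure.
HS.model.fixed <- '
  # three-factor model
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9

  # intercepts of x1, x2, x3 and x4 fixed to 0.5
  x1 + x2 + x3 + x4 ~ 0.5*1
'

# Check how the syntax is parsed: the four intercepts should be
# fixed (free == 0) with a starting value (ustart) of 0.5.
pt.fixed <- lavaanify(HS.model.fixed)
pt.fixed[pt.fixed$op == "~1", c("lhs", "op", "free", "ustart")]

# Fit the model with the fixed intercepts.
fit.fixed <- cfa(HS.model.fixed, data = HolzingerSwineford1939)
```

Inspecting the parsed parameter table before fitting is a quick way to confirm that a constraint was read as intended.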