<- '
model # efa block 1
efa("efa1")*f1 +
efa("efa1")*f2 =~ x1 + x2 + x3 + x4 + x5 + x6
# efa block 2
efa("efa2")*f3 +
efa("efa2")*f4 =~ y1 + y2 + y3 + y4 + y5 + y6
# cfa block
f5 =~ z7 + z8 + z9
f6 =~ z10 + z11 + z12
# regressions
f3 ~ f1 + f2
f4 ~ f3
'
ESEM and EFA
If a measurement model contains multiple latent variables (factors), we usually know which indicators belong to each factor. We call this the factor structure. Confirmatory factor analysis can be used to check if this a priori factor structure holds in the data. There are settings, however, where the factor structure is unclear, and we wish to rotate the solution in order to find a suitable structure in a given model. When the model also includes a structural part (i.e., regressions among the latent variables), this is referred to as exploratory structural equation modeling or ESEM. If there is only a measurement part, this is called exploratory factor analysis (EFA). What they have in common is that the factor structure (for one or more blocks) is found by means of rotation.
ESEM
To illustrate how ESEM works in lavaan, consider the following syntax:
This model syntax defines six latent variables (or factors). For f5 and f6, the factor structure is known, and they belong to a regular CFA block. But for f1 and f2, the factor structure is not known, and we will use a rotation method to find an appropiate structure. The f1 and f2 factors belong together in an EFA block that is (arbitrarily) named efa1
. The efa("efa1")*
modifier just before f1 and f2 is used to alert lavaan that these two factors belong to the same EFA block. The factors f3 and f4 belong to a different EFA block (named efa2
) and will be rotated independently. The structural part of the model is given as usual. To fit this model, we could call the sem()
function as follows:
<- sem(model = model, data = myData, rotation = "geomin") fit
Different rotation criteria are available, and many rotation options can be provided (see the manual page for the efa()
function for an overview).
To illustrate ESEM, we will borrow an example from the Mplus User’s Guide (example 5.25). First we read in the data:
<- read.table("http://statmodel.com/usersguide/chap5/ex5.25.dat")
ex5_25 names(ex5_25) = paste0("y",1:12)
The model syntax contains a single EFA block (efa1
for factors f1 and f2) and single CFA block (for f3 and f4):
<- '
model # efa block
efa("efa1")*f1 +
efa("efa1")*f2 =~ y1 + y2 + y3 + y4 + y5 + y6
# cfa block
f3 =~ y7 + y8 + y9
f4 =~ y10 + y11 + y12
# regressions
f3 ~ f1 + f2
f4 ~ f3
'
The following command illustrates the use of various rotation arguments:
<- sem(model = model, data = ex5_25, rotation = "geomin",
fit # mimic Mplus
information = "observed",
rotation.args = list(rstarts = 30, row.weights = "none",
algorithm = "gpa", std.ov = TRUE,
geomin.epsilon = 0.0001))
summary(fit)
lavaan 0.6-19 ended normally after 35 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 34
Row rank of the constraints matrix 2
Rotation method GEOMIN OBLIQUE
Geomin epsilon 1e-04
Rotation algorithm (rstarts) GPA (30)
Standardized metric TRUE
Row weights None
Number of observations 500
Model Test User Model:
Test statistic 51.353
Degrees of freedom 46
P-value (Chi-square) 0.272
Parameter Estimates:
Standard errors Standard
Information Observed
Observed information based on Hessian
Latent Variables:
Estimate Std.Err z-value P(>|z|)
f1 =~ efa1
y1 0.751 0.048 15.621 0.000
y2 0.858 0.042 20.469 0.000
y3 0.736 0.045 16.343 0.000
y4 0.036 0.051 0.712 0.476
y5 -0.028 0.049 -0.564 0.573
y6 0.002 0.003 0.694 0.488
f2 =~ efa1
y1 0.034 0.045 0.758 0.449
y2 -0.002 0.015 -0.151 0.880
y3 -0.008 0.035 -0.219 0.827
y4 0.763 0.050 15.374 0.000
y5 0.810 0.048 16.796 0.000
y6 0.802 0.041 19.467 0.000
f3 =~
y7 1.000
y8 0.894 0.021 41.936 0.000
y9 0.902 0.021 42.479 0.000
f4 =~
y10 1.000
y11 0.734 0.028 26.424 0.000
y12 0.684 0.028 24.405 0.000
Regressions:
Estimate Std.Err z-value P(>|z|)
f3 ~
f1 0.493 0.058 8.455 0.000
f2 0.721 0.057 12.755 0.000
f4 ~
f3 0.546 0.032 16.975 0.000
Covariances:
Estimate Std.Err z-value P(>|z|)
f1 ~~
f2 0.479 0.053 9.072 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
.y1 0.376 0.034 11.064 0.000
.y2 0.290 0.035 8.239 0.000
.y3 0.406 0.034 11.817 0.000
.y4 0.408 0.035 11.742 0.000
.y5 0.329 0.033 10.046 0.000
.y6 0.393 0.035 11.073 0.000
.y7 0.183 0.019 9.796 0.000
.y8 0.191 0.017 11.269 0.000
.y9 0.181 0.017 10.812 0.000
.y10 0.240 0.027 8.746 0.000
.y11 0.183 0.017 10.791 0.000
.y12 0.213 0.018 11.998 0.000
f1 1.000
f2 1.000
.f3 0.527 0.049 10.644 0.000
.f4 0.565 0.049 11.488 0.000
Exploratory factor analysis (EFA)
When there is no structural part (i.e., no regressions among the latent variables) and there is only a single EFA block, then ESEM reduces to exploratory factor analysis (EFA). Using the Holzinger and Swineford data, we could specify an EFA with three factors as follows:
<- '
efa.model efa("efa")*f1 +
efa("efa")*f2 +
efa("efa")*f3 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
'
<- cfa(efa.model, data = HolzingerSwineford1939)
fit summary(fit, standardized = TRUE)
lavaan 0.6-19 ended normally after 1 iteration
Estimator ML
Optimization method NLMINB
Number of model parameters 39
Row rank of the constraints matrix 6
Rotation method GEOMIN OBLIQUE
Geomin epsilon 0.001
Rotation algorithm (rstarts) GPA (30)
Standardized metric TRUE
Row weights None
Number of observations 301
Model Test User Model:
Test statistic 22.897
Degrees of freedom 12
P-value (Chi-square) 0.029
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
f1 =~ efa
x1 0.712 0.092 7.771 0.000 0.712 0.611
x2 0.628 0.104 6.063 0.000 0.628 0.534
x3 0.796 0.096 8.255 0.000 0.796 0.705
x4 0.011 0.011 0.944 0.345 0.011 0.009
x5 -0.107 0.089 -1.203 0.229 -0.107 -0.083
x6 0.076 0.073 1.028 0.304 0.076 0.069
x7 -0.278 0.109 -2.538 0.011 -0.278 -0.255
x8 0.012 0.008 1.371 0.170 0.012 0.011
x9 0.314 0.076 4.142 0.000 0.314 0.312
f2 =~ efa
x1 0.198 0.103 1.917 0.055 0.198 0.170
x2 0.039 0.092 0.424 0.672 0.039 0.033
x3 -0.106 0.111 -0.963 0.335 -0.106 -0.094
x4 0.981 0.058 16.850 0.000 0.981 0.844
x5 1.153 0.074 15.545 0.000 1.153 0.895
x6 0.886 0.062 14.338 0.000 0.886 0.810
x7 0.011 0.012 0.923 0.356 0.011 0.010
x8 -0.075 0.066 -1.135 0.256 -0.075 -0.074
x9 -0.002 0.007 -0.315 0.753 -0.002 -0.002
f3 =~ efa
x1 0.015 0.048 0.302 0.762 0.015 0.012
x2 -0.166 0.092 -1.813 0.070 -0.166 -0.141
x3 0.002 0.048 0.036 0.971 0.002 0.002
x4 0.004 0.047 0.091 0.927 0.004 0.004
x5 0.012 0.036 0.322 0.747 0.012 0.009
x6 -0.017 0.041 -0.409 0.683 -0.017 -0.015
x7 0.843 0.105 7.999 0.000 0.843 0.775
x8 0.752 0.076 9.893 0.000 0.752 0.744
x9 0.484 0.070 6.954 0.000 0.484 0.481
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
f1 ~~
f2 0.373 0.118 3.173 0.002 0.373 0.373
f3 0.432 0.097 4.465 0.000 0.432 0.432
f2 ~~
f3 0.306 0.081 3.775 0.000 0.306 0.306
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.x1 0.696 0.087 8.038 0.000 0.696 0.513
.x2 1.035 0.102 10.151 0.000 1.035 0.749
.x3 0.692 0.097 7.134 0.000 0.692 0.543
.x4 0.377 0.048 7.902 0.000 0.377 0.279
.x5 0.403 0.061 6.590 0.000 0.403 0.243
.x6 0.365 0.042 8.613 0.000 0.365 0.305
.x7 0.594 0.106 5.624 0.000 0.594 0.502
.x8 0.479 0.080 5.958 0.000 0.479 0.469
.x9 0.551 0.060 9.132 0.000 0.551 0.543
f1 1.000 1.000 1.000
f2 1.000 1.000 1.000
f3 1.000 1.000 1.000
In version 0.6-13, we added added the efa()
function to simplify the input, and to produce output that is more in line with traditional EFA software in R. There is no need to create a model syntax. You only need to provide the data, and the number of factors. Instead of a single number, you can also specify a range of numbers. For example:
<- paste("x", 1:9, sep = "")
var.names <- efa(data = HolzingerSwineford1939[,var.names], nfactors = 1:3) fit
Warning: lavaan->lav_model_vcov():
The variance-covariance matrix of the estimated parameters (vcov) does not
appear to be positive definite! The smallest eigenvalue (= -7.900395e-19)
is smaller than zero. This may be a symptom that the model is not
identified.
Warning: lavaan->lav_model_vcov():
The variance-covariance matrix of the estimated parameters (vcov) does not
appear to be positive definite! The smallest eigenvalue (= -2.001699e-18)
is smaller than zero. This may be a symptom that the model is not
identified.
summary(fit)
This is lavaan 0.6-19 -- running exploratory factor analysis
Estimator ML
Rotation method GEOMIN OBLIQUE
Geomin epsilon 0.001
Rotation algorithm (rstarts) GPA (30)
Standardized metric TRUE
Row weights None
Number of observations 301
Overview models:
aic bic sabic chisq df pvalue cfi rmsea
nfactors = 1 7738.448 7805.176 7748.091 312.264 27 0.000 0.677 0.187
nfactors = 2 7572.491 7668.876 7586.418 130.306 19 0.000 0.874 0.140
nfactors = 3 7479.081 7601.416 7496.758 22.897 12 0.029 0.988 0.055
Eigenvalues correlation matrix:
ev1 ev2 ev3 ev4 ev5 ev6 ev7 ev8 ev9
3.216 1.639 1.365 0.699 0.584 0.500 0.473 0.286 0.238
Number of factors: 1
Standardized loadings: (* = significant at 1% level)
f1 unique.var communalities
x1 0.438* 0.808 0.192
x2 .* 0.951 0.049
x3 .* 0.950 0.050
x4 0.848* 0.281 0.719
x5 0.841* 0.293 0.707
x6 0.838* 0.298 0.702
x7 .* 0.967 0.033
x8 .* 0.960 0.040
x9 0.307* 0.906 0.094
f1
Sum of squared loadings 2.586
Proportion of total 1.000
Proportion var 0.287
Cumulative var 0.287
Number of factors: 2
Standardized loadings: (* = significant at 1% level)
f1 f2 unique.var communalities
x1 .* 0.430* 0.673 0.327
x2 . .* 0.906 0.094
x3 0.456* 0.783 0.217
x4 0.851* 0.274 0.726
x5 0.868* 0.264 0.736
x6 0.825* 0.302 0.698
x7 0.448* 0.802 0.198
x8 0.627* 0.630 0.370
x9 0.734* 0.458 0.542
f1 f2 total
Sum of sq (obliq) loadings 2.280 1.629 3.909
Proportion of total 0.583 0.417 1.000
Proportion var 0.253 0.181 0.434
Cumulative var 0.253 0.434 0.434
Factor correlations: (* = significant at 1% level)
f1 f2
f1 1.000
f2 0.339 1.000
Number of factors: 3
Standardized loadings: (* = significant at 1% level)
f1 f2 f3 unique.var communalities
x1 0.611* .* 0.513 0.487
x2 0.534* . 0.749 0.251
x3 0.705* 0.543 0.457
x4 0.844* 0.279 0.721
x5 0.895* 0.243 0.757
x6 0.810* 0.305 0.695
x7 .* 0.775* 0.502 0.498
x8 0.744* 0.469 0.531
x9 0.312* 0.481* 0.543 0.457
f2 f3 f1 total
Sum of sq (obliq) loadings 2.215 1.343 1.297 4.855
Proportion of total 0.456 0.277 0.267 1.000
Proportion var 0.246 0.149 0.144 0.539
Cumulative var 0.246 0.395 0.539 0.539
Factor correlations: (* = significant at 1% level)
f1 f2 f3
f1 1.000
f2 0.373 1.000
f3 0.432 0.306 1.000