# ESEM and EFA

If a measurement model contains multiple latent variables (factors), we usually know which indicators belong to each factor. We call this the factor structure. Confirmatory factor analysis can be used to check if this a priori factor structure holds in the data. There are settings, however, where the factor structure is unclear, and we wish to rotate the solution in order to find a suitable structure in a given model. When the model also includes a structural part (i.e., regressions among the latent variables), this is referred to as exploratory structural equation modeling or ESEM. If there is only a measurement part, this is called exploratory factor analysis (EFA). What they have in common is that the factor structure (for one or more blocks) is found by means of rotation.

## ESEM

To illustrate how ESEM works in lavaan, consider the following syntax:

``````model <- '
# efa block 1
efa("efa1")*f1 +
efa("efa1")*f2 =~ x1 + x2 + x3 + x4 + x5 + x6

# efa block 2
efa("efa2")*f3 +
efa("efa2")*f4 =~ y1 + y2 + y3 + y4 + y5 + y6

# cfa block
f5 =~ z7 + z8 + z9
f6 =~ z10 + z11 + z12

# regressions
f3 ~ f1 + f2
f4 ~ f3
'``````

This model syntax defines six latent variables (or factors). For f5 and f6, the factor structure is known, and they belong to a regular CFA block. But for f1 and f2, the factor structure is not known, and we will use a rotation method to find an appropiate structure. The f1 and f2 factors belong together in an EFA block that is (arbitrarily) named `efa1`. The `efa("efa1")*` modifier just before f1 and f2 is used to alert lavaan that these two factors belong to the same EFA block. The factors f3 and f4 belong to a different EFA block (named `efa2`) and will be rotated independently. The structural part of the model is given as usual. To fit this model, we could call the `sem()` function as follows:

``fit <- sem(model = model, data = myData, rotation = "geomin")``

Different rotation criteria are available, and many rotation options can be provided (see the manual page for the `efa()` function for an overview).

To illustrate ESEM, we will borrow an example from the Mplus User’s Guide (example 5.25). First we read in the data:

``````ex5_25 <- read.table("http://statmodel.com/usersguide/chap5/ex5.25.dat")
names(ex5_25) = paste0("y",1:12)``````

The model syntax contains a single EFA block (`efa1` for factors f1 and f2) and single CFA block (for f3 and f4):

``````model <- '
# efa block
efa("efa1")*f1 +
efa("efa1")*f2 =~ y1 + y2 + y3 + y4 + y5 + y6

# cfa block
f3 =~ y7 + y8 + y9
f4 =~ y10 + y11 + y12

# regressions
f3 ~ f1 + f2
f4 ~ f3
'``````

The following command illustrates the use of various rotation arguments:

``````fit <- sem(model = model, data = ex5_25, rotation = "geomin",
# mimic Mplus
information = "observed",
rotation.args = list(rstarts = 30, row.weights = "none",
algorithm = "gpa", std.ov = TRUE,
geomin.epsilon = 0.0001))
summary(fit)``````
``````lavaan 0.6.17.1884 ended normally after 35 iterations

Estimator                                         ML
Optimization method                           NLMINB
Number of model parameters                        32

Rotation method                       GEOMIN OBLIQUE
Geomin epsilon                                 1e-04
Rotation algorithm (rstarts)                GPA (30)
Standardized metric                             TRUE
Row weights                                     None

Number of observations                           500

Model Test User Model:

Test statistic                                51.353
Degrees of freedom                                46
P-value (Chi-square)                           0.272

Parameter Estimates:

Standard errors                             Standard
Information                                 Observed
Observed information based on                Hessian

Latent Variables:
Estimate  Std.Err  z-value  P(>|z|)
f1 =~ efa1
y1                0.751    0.048   15.621    0.000
y2                0.858    0.042   20.469    0.000
y3                0.736    0.045   16.343    0.000
y4                0.036    0.051    0.712    0.476
y5               -0.028    0.049   -0.564    0.573
y6                0.002    0.003    0.694    0.488
f2 =~ efa1
y1                0.034    0.045    0.758    0.449
y2               -0.002    0.015   -0.151    0.880
y3               -0.008    0.035   -0.219    0.827
y4                0.763    0.050   15.374    0.000
y5                0.810    0.048   16.796    0.000
y6                0.802    0.041   19.467    0.000
f3 =~
y7                1.000
y8                0.894    0.021   41.936    0.000
y9                0.902    0.021   42.479    0.000
f4 =~
y10               1.000
y11               0.734    0.028   26.424    0.000
y12               0.684    0.028   24.405    0.000

Regressions:
Estimate  Std.Err  z-value  P(>|z|)
f3 ~
f1                0.493    0.058    8.455    0.000
f2                0.721    0.057   12.755    0.000
f4 ~
f3                0.546    0.032   16.975    0.000

Covariances:
Estimate  Std.Err  z-value  P(>|z|)
f1 ~~
f2                0.479    0.053    9.072    0.000

Variances:
Estimate  Std.Err  z-value  P(>|z|)
.y1                0.376    0.034   11.064    0.000
.y2                0.290    0.035    8.239    0.000
.y3                0.406    0.034   11.817    0.000
.y4                0.408    0.035   11.742    0.000
.y5                0.329    0.033   10.046    0.000
.y6                0.393    0.035   11.073    0.000
.y7                0.183    0.019    9.796    0.000
.y8                0.191    0.017   11.269    0.000
.y9                0.181    0.017   10.812    0.000
.y10               0.240    0.027    8.746    0.000
.y11               0.183    0.017   10.791    0.000
.y12               0.213    0.018   11.998    0.000
f1                1.000
f2                1.000
.f3                0.527    0.049   10.644    0.000
.f4                0.565    0.049   11.488    0.000``````

## Exploratory factor analysis (EFA)

When there is no structural part (i.e., no regressions among the latent variables) and there is only a single EFA block, then ESEM reduces to exploratory factor analysis (EFA). Using the Holzinger and Swineford data, we could specify an EFA with three factors as follows:

``````efa.model <- '
efa("efa")*f1 +
efa("efa")*f2 +
efa("efa")*f3 =~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9
'
fit <- cfa(efa.model, data = HolzingerSwineford1939)
summary(fit, standardized = TRUE)``````
``````lavaan 0.6.17.1884 ended normally after 1 iteration

Estimator                                         ML
Optimization method                           NLMINB
Number of model parameters                        33

Rotation method                       GEOMIN OBLIQUE
Geomin epsilon                                 0.001
Rotation algorithm (rstarts)                GPA (30)
Standardized metric                             TRUE
Row weights                                     None

Number of observations                           301

Model Test User Model:

Test statistic                                22.897
Degrees of freedom                                12
P-value (Chi-square)                           0.029

Parameter Estimates:

Standard errors                             Standard
Information                                 Expected
Information saturated (h1) model          Structured

Latent Variables:
Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
f1 =~ efa
x1                0.712    0.092    7.771    0.000    0.712    0.611
x2                0.628    0.104    6.063    0.000    0.628    0.534
x3                0.796    0.096    8.255    0.000    0.796    0.705
x4                0.011    0.011    0.944    0.345    0.011    0.009
x5               -0.107    0.089   -1.203    0.229   -0.107   -0.083
x6                0.076    0.073    1.028    0.304    0.076    0.069
x7               -0.278    0.109   -2.538    0.011   -0.278   -0.255
x8                0.012    0.008    1.371    0.170    0.012    0.011
x9                0.314    0.076    4.142    0.000    0.314    0.312
f2 =~ efa
x1                0.198    0.103    1.917    0.055    0.198    0.170
x2                0.039    0.092    0.424    0.672    0.039    0.033
x3               -0.106    0.111   -0.963    0.335   -0.106   -0.094
x4                0.981    0.058   16.850    0.000    0.981    0.844
x5                1.153    0.074   15.545    0.000    1.153    0.895
x6                0.886    0.062   14.338    0.000    0.886    0.810
x7                0.011    0.012    0.923    0.356    0.011    0.010
x8               -0.075    0.066   -1.135    0.256   -0.075   -0.074
x9               -0.002    0.007   -0.315    0.753   -0.002   -0.002
f3 =~ efa
x1                0.015    0.048    0.302    0.762    0.015    0.012
x2               -0.166    0.092   -1.813    0.070   -0.166   -0.141
x3                0.002    0.048    0.036    0.971    0.002    0.002
x4                0.004    0.047    0.091    0.927    0.004    0.004
x5                0.012    0.036    0.322    0.747    0.012    0.009
x6               -0.017    0.041   -0.409    0.683   -0.017   -0.015
x7                0.843    0.105    7.999    0.000    0.843    0.775
x8                0.752    0.076    9.893    0.000    0.752    0.744
x9                0.484    0.070    6.954    0.000    0.484    0.481

Covariances:
Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
f1 ~~
f2                0.373    0.118    3.173    0.002    0.373    0.373
f3                0.432    0.097    4.465    0.000    0.432    0.432
f2 ~~
f3                0.306    0.081    3.775    0.000    0.306    0.306

Variances:
Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
.x1                0.696    0.087    8.038    0.000    0.696    0.513
.x2                1.035    0.102   10.151    0.000    1.035    0.749
.x3                0.692    0.097    7.134    0.000    0.692    0.543
.x4                0.377    0.048    7.902    0.000    0.377    0.279
.x5                0.403    0.061    6.590    0.000    0.403    0.243
.x6                0.365    0.042    8.613    0.000    0.365    0.305
.x7                0.594    0.106    5.624    0.000    0.594    0.502
.x8                0.479    0.080    5.958    0.000    0.479    0.469
.x9                0.551    0.060    9.132    0.000    0.551    0.543
f1                1.000                               1.000    1.000
f2                1.000                               1.000    1.000
f3                1.000                               1.000    1.000``````

In version 0.6-13, we added added the `efa()` function to simplify the input, and to produce output that is more in line with traditional EFA software in R. There is no need to create a model syntax. You only need to provide the data, and the number of factors. Instead of a single number, you can also specify a range of numbers. For example:

``````var.names <- paste("x", 1:9, sep = "")
fit <- efa(data = HolzingerSwineford1939[,var.names], nfactors = 1:3)
summary(fit)``````
``````This is lavaan 0.6.17.1884 -- running exploratory factor analysis

Estimator                                         ML
Rotation method                       GEOMIN OBLIQUE
Geomin epsilon                                 0.001
Rotation algorithm (rstarts)                GPA (30)
Standardized metric                             TRUE
Row weights                                     None

Number of observations                           301

Overview models:
aic      bic    sabic   chisq df pvalue   cfi rmsea
nfactors = 1 7738.448 7805.176 7748.091 312.264 27  0.000 0.677 0.187
nfactors = 2 7572.491 7668.876 7586.418 130.306 19  0.000 0.874 0.140
nfactors = 3 7479.081 7601.416 7496.758  22.897 12  0.029 0.988 0.055

Eigenvalues correlation matrix:

ev1     ev2     ev3     ev4     ev5     ev6     ev7     ev8     ev9
3.216   1.639   1.365   0.699   0.584   0.500   0.473   0.286   0.238

Number of factors:  1

f1       unique.var   communalities
x1  0.438*           0.808           0.192
x2      .*           0.951           0.049
x3      .*           0.950           0.050
x4  0.848*           0.281           0.719
x5  0.841*           0.293           0.707
x6  0.838*           0.298           0.702
x7      .*           0.967           0.033
x8      .*           0.960           0.040
x9  0.307*           0.906           0.094

f1
Proportion of total     1.000
Proportion var          0.287
Cumulative var          0.287

Number of factors:  2

f1      f2       unique.var   communalities
x1      .*  0.430*           0.673           0.327
x2      .       .*           0.906           0.094
x3          0.456*           0.783           0.217
x4  0.851*                   0.274           0.726
x5  0.868*                   0.264           0.736
x6  0.825*                   0.302           0.698
x7          0.448*           0.802           0.198
x8          0.627*           0.630           0.370
x9          0.734*           0.458           0.542

f1    f2 total
Proportion of total        0.583 0.417 1.000
Proportion var             0.253 0.181 0.434
Cumulative var             0.253 0.434 0.434

Factor correlations: (* = significant at 1% level)

f1      f2
f1  1.000
f2  0.339*  1.000

Number of factors:  3

f1      f2      f3       unique.var   communalities
x1  0.611*      .                    0.513           0.487
x2  0.534*              .            0.749           0.251
x3  0.705*                           0.543           0.457
x4          0.844*                   0.279           0.721
x5          0.895*                   0.243           0.757
x6          0.810*                   0.305           0.695
x7      .           0.775*           0.502           0.498
x8                  0.744*           0.469           0.531
x9  0.312*          0.481*           0.543           0.457

f2    f3    f1 total