Estimators and more

Estimators

If all data is continuous, the default estimator in the lavaan package is maximum likelihood (estimator = "ML"). Alternative estimators available in lavaan are:

"GLS": generalized least squares. For complete data only.
"WLS": weighted least squares (sometimes called ADF estimation). For complete data only.
"DWLS": diagonally weighted least squares
"ULS": unweighted least squares
"DLS": distributionally-weighted least squares
"PML": pairwise maximum likelihood

Many estimators have ‘robust’ variants, meaning that they provide robust standard errors and a scaled test statistic. For example, for the maximum likelihood estimator, lavaan provides the following robust variants:

"MLM": maximum likelihood estimation with robust standard errors and a Satorra-Bentler scaled test statistic. For complete data only.
"MLMVS": maximum likelihood estimation with robust standard errors and a mean- and variance-adjusted test statistic (aka the Satterthwaite approach). For complete data only.
"MLMV": maximum likelihood estimation with robust standard errors and a mean- and variance-adjusted test statistic (using a scale-shifted approach). For complete data only.
"MLF": for maximum likelihood estimation with standard errors based on the first-order derivatives, and a conventional test statistic. For both complete and incomplete data.
"MLR": maximum likelihood estimation with robust (Huber-White) standard errors and a scaled test statistic that is (asymptotically) equal to the Yuan-Bentler test statistic. For both complete and incomplete data.

For the DWLS and ULS estimators, lavaan also provides ‘robust’ variants: WLSM, WLSMVS, WLSMV, ULSM, ULSMVS, ULSMV. Note that for the robust WLS variants, we use the diagonal of the weight matrix for estimation, but we use the full weight matrix to correct the standard errors and to compute the test statistic.

ML estimation: Wishart versus Normal

If maximum likelihood estimation is used ("ML" or any of its robust variants), the default behavior of lavaan is to base the analysis on the so-called biased sample covariance matrix, where the elements are divided by N instead of N-1. This is done internally, and should not be done by the user. In addition, the chi-square statistic is computed by multiplying the minimum function value with a factor N (instead of N-1). If you prefer to use an unbiased covariance matrix, and \(N-1\) as the multiplier to compute the chi-square statistic, you need to specify the likelihood = "wishart" argument when calling the fitting functions. For example:

fit <- cfa(HS.model, 
           data = HolzingerSwineford1939, 
           likelihood = "wishart")
fit

lavaan 0.7-2 ended normally after 35 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        21

  Number of observations                           301

Model Test User Model:
                                                      
  Test statistic                                85.022
  Degrees of freedom                                24
  P-value (Chi-square)                           0.000

The value of the test statistic will be closer to the value reported by programs like EQS, LISREL or AMOS, since they all use the ‘Wishart’ approach when using the maximum likelihood estimator. The program Mplus, on the other hand, uses the ‘normal’ approach to maximum likelihood estimation.

Missing values

If the data contain missing values, the default behavior is listwise deletion. If the missing mechanism is MCAR (missing completely at random) or MAR (missing at random), the lavaan package provides case-wise (or ‘full information’) maximum likelihood estimation. You can turn this feature on, by using the argument missing = "ML" when calling the fitting function. An unrestricted (h1) model will automatically be estimated, so that all common fit indices are available.

Standard errors

Standard errors are (by default) based on the expected information matrix. The only exception is when data are missing and full information ML is used (via missing = "ML"). In this case, the observed information matrix is used to compute the standard errors. The user can change this behavior by using the information argument.

Robust standard errors can be requested explicitly by using se = "robust". Similarly, robust test statistics can be requested explicitly by using test = "robust". Many more options are possible. See the help page:

?lavOptions

Bootstrapping

There are two ways to use the bootstrap in lavaan. Either you can set se = "bootstrap" or test = "bootstrap" when fitting the model (and you will get bootstrap standard errors, and/or a bootstrap-based p-value respectively), or you can use the bootstrapLavaan() function, which needs an already fitted lavaan object. The latter function can be used to ‘bootstrap’ any statistic (or vector of statistics) that you can extract from a fitted lavaan object.