c("item1",
Data[,"item2",
"item3",
"item4")] <-
lapply(Data[,c("item1",
"item2",
"item3",
"item4")], ordered)
Categorical data
Binary, ordinal and nominal variables are considered categorical (not continuous). It makes a big difference if these categorical variables are exogenous (independent) or endogenous (dependent) in the model.
Exogenous categorical variables
If you have a binary exogenous covariate (say, gender), all you need to do is to recode it as a dummy (0/1) variable. Just like you would do in a classic regression model. If you have an exogenous ordinal variable, you can use a coding scheme reflecting the order (say, 1,2,3,…) and treat it as any other (numeric) covariate. If you have a nominal categorical variable with \(K > 2\) levels, you need to replace it by a set of \(K-1\) dummy variables, again, just like you would do in classical regression.
Endogenous categorical variables
The lavaan 0.5 series can deal with binary and ordinal (but not nominal) endogenous variables. There are two ways to communicate to lavaan that some of the endogenous variables are to be treated as categorical:
declare them as ‘ordered’ (using the
ordered
function, which is part of base R) in your data.frame before you run the analysis; for example, if you need to declare four variables (say,item1
,item2
,item3
,item4
) as ordinal in your data.frame (calledData
), you can use something like:use the
ordered
argument when using one of the fitting functions (cfa/sem/growth/lavaan), for example, if you have four binary or ordinal variables (say,item1
,item2
,item3
,item4
), you can use:<- cfa(myModel, data = myData, fit ordered = c("item1","item2", "item3","item4"))
If all the (endogenous) variables are to be treated as categorical, you can use ordered = TRUE
as a shortcut.
When the ordered=
argument is used, lavaan will automatically switch to the WLSMV
estimator: it will use diagonally weighted least squares (DWLS
) to estimate the model parameters, but it will use the full weight matrix to compute robust standard errors, and a mean- and variance-adjusted test statistic. Other options are unweighted least squares (ULSMV), or pairwise maximum likelihood (PML). Full information maximum likelihood is currently not supported.