SMHS BigDataBigSci SEM sem vs cfa

Structural Equation Modeling (SEM) Example 2 (Parkinson’s Disease Data) - sem() vs. cfa()

The function sem() is very similar to the function cfa(). As we did not include fit.measures=TRUE, the report only includes the basic chi-square test statistic. The argument standardized=TRUE, reports standardized parameter values. Two extra columns of standardized parameter values are printed. In the first column (labeled, Std.lv), only the latent variables are standardized. In the second column (labeled Std.all), both latent and observed variables are standardized. The latter is often called the `completely standardized solution'.

# remove all objects from the R console (current workspace)
rm(list = ls())

# library("lavaan")

#load data   05_PPMI_top_UPDRS_Integrated_LongFormat1.csv ( dim(myData) 1764   31 ), long format
# myData <- read.csv("https://umich.instructure.com/files/330397/download?download_frd=1&verifier=3bYRT9FXgBGMCQv8MNxsclWnMgodiJRYo3ODFtDq",header=TRUE)
# dichotomize the "ResearchGroup" variable

myData$\$$ResearchGroup <- ifelse(myData$\$$ResearchGroup == "Control", 1, 0)

# library("MASS")
myData2<-scale(myData); head(myData2)
myData2[, 20] <- myData$\$$ResearchGroup;  # replace the ResearchGroup by orig binary labels (do not rescale)
 attach(myData2)
 head(myData2)

 # model3 <-
    '
 # latent variable definitions - defining how the latent variables are “manifested by” a set of observed 
 # (or manifest) variables, aka “indicators”
 # (1) Measurement Model 
 # Imaging =~ L_cingulate_gyrus_ComputeArea + cerebellum_Volume
 Imaging =~  R_insular_cortex_ComputeArea + R_insular_cortex_Volume
 DemoGeno =~ Weight+Sex+Age
 # UPDRS =~ UPDRS_Part_I_Summary_Score_Baseline+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline
 UPDRS =~  UPDRS_part_I  +UPDRS_part_II + UPDRS_part_III
 # (2) Regressions 
 ResearchGroup ~ Imaging + DemoGeno + UPDRS 
 '

Here is why we needed to scale the data first before we fit the SEM model:

 # using raw data provides unreliable estimates
 <font color="red"># some observed variances are (at least) a factor 1000 times larger than others;</font>
 fit3 <- sem(model3, data=<b>myData</b>, estimator="MLM")
 summary(fit3)

 # using scaled data provides stable estimates
 fit3 <- sem(model3, data=myData2, estimator="MLM")
 summary(fit3)

 # report the standardized coefficients of the model
 standardizedSolution(fit3)

 # variation explained by model components (The R-square value for all endogenous variables)
 inspect(fit3, "r2")

Note that all variances are supposed to be positive, however, occasionally, model estimates may generate a residual variance that is negative.  This may happen due to random sampling error where a very small true value may sometimes produce a negative estimate, or it can occur when the model is unstable.

 # Inspect the fitted values variance-covariance matrix:
 fitted(fit3)$\$$cov

#
Model fitting
Name	Command
fit CFA to data	cfa(model, data=Data)
fit SEM to data	sem(model, data=Data)
standardized solution	sem(model, data=Data, std.ov=TRUE)
orthogonal factors	cfa(model, data=Data, orthogonal=TRUE)
Matrices
Name	Command
Factor covariance matrix	inspect(fit, "coefficients")$\$$psi
 Fitted covariance matrix	fitted(fit)$\$$cov
Observed covariance matrix	inspect(fit, 'sampstat')$\$$cov
 Residual covariance matrix	resid(fit)$\$$cov
Factor correlation matrix	cov2cor(inspect(fit, "coefficients")$\$$psi) or use covariance command with standardised solution e.g., cfa(..., std.ov=TRUE)
 Fit Measures
 <b>Name	Command</b>
 Fit measures:	fitMeasures(fit)
 Specific fit measures e.g.:	fitMeasures(fit)[c('chisq', 'df', 'pvalue', 'cfi', 'rmsea', 'srmr')]
 Parameters
 <b>Name	Command</b>
 Parameter information	parTable(fit)
 Standardised estimates	standardizedSolution(fit) or summary(fit, standardized=TRUE)
 R-squared | inspect(fit, 'r2')
 Compare models
 <b>Name	Command</b>
 Compare fit measures	cbind(m1=inspect(m1_fit, 'fit.measures'), m2=inspect(m2_fit, 'fit.measures'))
 Chi-square difference test	anova(m1_fit, m2_fit)
 Model improvement
 <b>Name	Command</b>
 Modification indices	mod_ind <- modificationindices(fit)
 10 greatest	head(mod_ind[order(mod_ind$\$$mi, decreasing=TRUE), ], 10)
mi > 5	subset(mod_ind[order(mod_ind$\$$mi, decreasing=TRUE), ], mi > 5)

# to account for groupings (gender)
# fit3 <- sem(model3, data=myData, group="Sex", estimator="MLM")

# install.packages("semPlot")
library(semPlot)

# Plot standardized model (numerical):
# semPaths(fit3, what = "est", layout = "tree", title = TRUE, style = "LISREL")
semPaths(fit3)

(default)	Deutsch	Español	Français	Italiano	Português	日本語	България	الامارات العربية المتحدة	Suomi	इस भाषा में	Norge
한국어	中文	繁体中文	Русский	Nederlands	Ελληνικά	Hrvatska	Česká republika	Danmark	Polska	România	Sverige

SMHS BigDataBigSci SEM sem vs cfa

Structural Equation Modeling (SEM) Example 2 (Parkinson’s Disease Data) - sem() vs. cfa()

See also

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Tools