SMHS BigDataBigSci SEM sem vs cfa

From SOCR
Revision as of 12:06, 5 May 2016 by Dinov (talk | contribs) (See also)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Structural Equation Modeling (SEM) Example 2 (Parkinson’s Disease Data) - sem() vs. cfa()

The function sem() is very similar to the function cfa(). As we did not include fit.measures=TRUE, the report only includes the basic chi-square test statistic. The argument standardized=TRUE, reports standardized parameter values. Two extra columns of standardized parameter values are printed. In the first column (labeled, Std.lv), only the latent variables are standardized. In the second column (labeled Std.all), both latent and observed variables are standardized. The latter is often called the `completely standardized solution'.

# remove all objects from the R console (current workspace)
rm(list = ls())
# library("lavaan")
#load data   05_PPMI_top_UPDRS_Integrated_LongFormat1.csv ( dim(myData) 1764   31 ), long format
# myData <- read.csv("https://umich.instructure.com/files/330397/download?download_frd=1&verifier=3bYRT9FXgBGMCQv8MNxsclWnMgodiJRYo3ODFtDq",header=TRUE)
# dichotomize the "ResearchGroup" variable
myData$\$$ResearchGroup <- ifelse(myData$\$$ResearchGroup == "Control", 1, 0)
# library("MASS")
myData2<-scale(myData); head(myData2)
myData2[, 20] <- myData$\$$ResearchGroup;  # replace the ResearchGroup by orig binary labels (do not rescale)
 attach(myData2)
 head(myData2)

 # model3 <-
    '
 # latent variable definitions - defining how the latent variables are “manifested by” a set of observed 
 # (or manifest) variables, aka “indicators”
 # (1) Measurement Model 
 # Imaging =~ L_cingulate_gyrus_ComputeArea + cerebellum_Volume
 Imaging =~  R_insular_cortex_ComputeArea + R_insular_cortex_Volume
 DemoGeno =~ Weight+Sex+Age
 # UPDRS =~ UPDRS_Part_I_Summary_Score_Baseline+X_Assessment_Non.Motor_Geriatric_Depression_Scale_GDS_Short_Summary_Score_Baseline
 UPDRS =~  UPDRS_part_I  +UPDRS_part_II + UPDRS_part_III
 # (2) Regressions 
 ResearchGroup ~ Imaging + DemoGeno + UPDRS 
 '

Here is why we needed to scale the data first before we fit the SEM model:

 # using raw data provides unreliable estimates
 <font color="red"># some observed variances are (at least) a factor 1000 times larger than others;</font>
 fit3 <- sem(model3, data=<b>myData</b>, estimator="MLM")
 summary(fit3)

 # using scaled data provides stable estimates
 fit3 <- sem(model3, data=myData2, estimator="MLM")
 summary(fit3)

 # report the standardized coefficients of the model
 standardizedSolution(fit3)

 # variation explained by model components (The R-square value for all endogenous variables)
 inspect(fit3, "r2")

Note that all variances are supposed to be positive, however, occasionally, model estimates may generate a residual variance that is negative.  This may happen due to random sampling error where a very small true value may sometimes produce a negative estimate, or it can occur when the model is unstable.

 # Inspect the fitted values variance-covariance matrix:
 fitted(fit3)$\$$cov
#
Model fitting
Name	Command
fit CFA to data	cfa(model, data=Data)
fit SEM to data	sem(model, data=Data)
standardized solution	sem(model, data=Data, std.ov=TRUE)
orthogonal factors	cfa(model, data=Data, orthogonal=TRUE)
Matrices
Name	Command
Factor covariance matrix	inspect(fit, "coefficients")$\$$psi
 Fitted covariance matrix	fitted(fit)$\$$cov
Observed covariance matrix	inspect(fit, 'sampstat')$\$$cov
 Residual covariance matrix	resid(fit)$\$$cov
Factor correlation matrix	cov2cor(inspect(fit, "coefficients")$\$$psi) or use covariance command with standardised solution e.g., cfa(..., std.ov=TRUE)
 Fit Measures
 <b>Name	Command</b>
 Fit measures:	fitMeasures(fit)
 Specific fit measures e.g.:	fitMeasures(fit)[c('chisq', 'df', 'pvalue', 'cfi', 'rmsea', 'srmr')]
 Parameters
 <b>Name	Command</b>
 Parameter information	parTable(fit)
 Standardised estimates	standardizedSolution(fit) or summary(fit, standardized=TRUE)
 R-squared | inspect(fit, 'r2')
 Compare models
 <b>Name	Command</b>
 Compare fit measures	cbind(m1=inspect(m1_fit, 'fit.measures'), m2=inspect(m2_fit, 'fit.measures'))
 Chi-square difference test	anova(m1_fit, m2_fit)
 Model improvement
 <b>Name	Command</b>
 Modification indices	mod_ind <- modificationindices(fit)
 10 greatest	head(mod_ind[order(mod_ind$\$$mi, decreasing=TRUE), ], 10)
mi > 5	subset(mod_ind[order(mod_ind$\$$mi, decreasing=TRUE), ], mi > 5)
# to account for groupings (gender)
# fit3 <- sem(model3, data=myData, group="Sex", estimator="MLM")
# install.packages("semPlot")
library(semPlot)
# Plot standardized model (numerical):
# semPaths(fit3, what = "est", layout = "tree", title = TRUE, style = "LISREL")
semPaths(fit3)

SMHS BigDataBigSci7.png


See also




Translate this page:

(default)
Uk flag.gif

Deutsch
De flag.gif

Español
Es flag.gif

Français
Fr flag.gif

Italiano
It flag.gif

Português
Pt flag.gif

日本語
Jp flag.gif

България
Bg flag.gif

الامارات العربية المتحدة
Ae flag.gif

Suomi
Fi flag.gif

इस भाषा में
In flag.gif

Norge
No flag.png

한국어
Kr flag.gif

中文
Cn flag.gif

繁体中文
Cn flag.gif

Русский
Ru flag.gif

Nederlands
Nl flag.gif

Ελληνικά
Gr flag.gif

Hrvatska
Hr flag.gif

Česká republika
Cz flag.gif

Danmark
Dk flag.gif

Polska
Pl flag.png

România
Ro flag.png

Sverige
Se flag.gif