Difference between revisions of "SMHS BigDataBigSci SEM"

From SOCR
(Notation)
(See also)
 
(3 intermediate revisions by 2 users not shown)
Line 6: Line 6:
  
 
===SEM Advantages===
 
===SEM Advantages===
<li>It allows testing models with multiple dependent variables</li>
+
* It allows testing models with multiple dependent variables
<li>Provides mechanisms for modeling mediating variables</li>
+
* Provides mechanisms for modeling mediating variables
<li>Enables modeling of error terms</li>
+
* Enables modeling of error terms
<li>Facilitates modeling of challenging data  (longitudinal with auto-correlated errors, multi-level data, non-normal data, incomplete data)</li>
+
* Facilitates modeling of challenging data (longitudinal with auto-correlated errors, multi-level data, non-normal data, incomplete data)
  
 
SEM allows separation of observed and latent variables. Other standard statistical procedures may be viewed as special cases of SEM. In SEM, statistical significance is less important than in other techniques, and covariances are the core of structural equation models.
 
SEM allows separation of observed and latent variables. Other standard statistical procedures may be viewed as special cases of SEM. In SEM, statistical significance is less important than in other techniques, and covariances are the core of structural equation models.
  
 
===Definitions===
 
===Definitions===
<li>The <b>disturbance</b>, <i>D</i>, is the variance in Y unexplained by a variable X that is assumed to affect Y. </li>
+
*The <b>disturbance</b>, <i>D</i>, is the variance in Y unexplained by a variable X that is assumed to affect Y.
 
  X    →    Y  ←    D
 
  X    →    Y  ←    D
  
<li><b>Measurement error</b>, <i>E</i>, is the variance in X unexplained by A, where X is an observed variable that is presumed to measure a latent variable, <i>A</i>. </li>
+
* <b>Measurement error</b>, <i>E</i>, is the variance in X unexplained by A, where X is an observed variable that is presumed to measure a latent variable, <i>A</i>.
 
  A    →    X  ←    E
 
  A    →    X  ←    E
  
<li>Categorical variables in a model are <b>exogenous</b> (independent) or <b>endogenous</b> (dependent).</li>
+
* Categorical variables in a model are <b>exogenous</b> (independent) or <b>endogenous</b> (dependent).
  
 
===Notation===
 
===Notation===
  
<li>In SEM <b>observed (or manifest) indicators</b> are represented by <b>squares/rectangles</b> whereas latent variables (or factors) represented by circles/ovals.</li>
+
* In SEM, <b>observed (or manifest) indicators</b> are represented by <b>squares/rectangles</b>, whereas latent variables (or factors) are represented by circles/ovals.
  
 
<center>[[Image:SMHS_BigDataBigSci1.png|500px]]</center>
 
<center>[[Image:SMHS_BigDataBigSci1.png|500px]]</center>
  
<mark> PLEASE FIX LAST ARROW *'''Relations: Direct effects''' (&rarr;), '''Reciprocal effects''' (&harr; or &#8646;), and '''Correlation or covariance''' ( ARROW ) all have different appearance in SEM models.</mark>
+
*'''Relations: Direct effects''' (&rarr;), '''Reciprocal effects''' (&harr; or &#8646;), and '''Correlation or covariance''' (&#x293B; or &#x293A;) all have different appearance in SEM models.
  
 
===Model Components===
 
===Model Components===
Line 38: Line 38:
 
===Notes===
 
===Notes===
  
<li> Sample-size considerations: mostly same as for regression - more is always better</li>
+
* Sample-size considerations: mostly same as for regression - more is always better.
<li> Model assessment strategies: Chi-square test, Comparative Fit Index, Root Mean Square Error, Tucker Lewis Index, Goodness of Fit Index, AIC, and BIC.</li>
+
* Model assessment strategies: Chi-square test, Comparative Fit Index, Root Mean Square Error, Tucker Lewis Index, Goodness of Fit Index, AIC, and BIC.
<li> Choice for number of Indicator variables: depends on pilot data analyses, a priori concerns, fewer is better.</li>
+
* Choice for number of Indicator variables: depends on pilot data analyses, a priori concerns, fewer is better.
  
 
===[[SMHS_BigDataBigSci_SEM_Ex1|Hands-on Example 1 (School Kids Mental Abilities)]]===
 
===[[SMHS_BigDataBigSci_SEM_Ex1|Hands-on Example 1 (School Kids Mental Abilities)]]===
Line 46: Line 46:
  
 
===[[SMHS_BigDataBigSci_SEM_Ex2|Hands-on Example 2 (Parkinson’s Disease data)]]===
 
===[[SMHS_BigDataBigSci_SEM_Ex2|Hands-on Example 2 (Parkinson’s Disease data)]]===
 
  
 
==See also==
 
==See also==
* [[SMHS_BigDataBigSci| Back to Model-based Analytics]]  
+
* [[SMHS_BigDataBigSci| Back to Model-based Analytics]]
 +
* [[SMHS_BigDataBigSci_SEM_sem_vs_cfa| Differences and Similarities between '''sem'''() and '''cfa'''() ]]  
 
* [[SMHS_BigDataBigSci_GCM| Next Section: Growth Curve Modeling]]
 
* [[SMHS_BigDataBigSci_GCM| Next Section: Growth Curve Modeling]]
* [[SMHS_BigDataBigSci_GEE| Next Section: Generalized Estimating Equation (GEE) Modeling]]
+
* [[SMHS_BigDataBigSci_GEE| Next Section: Generalized Estimating Equation (GEE) Modeling]]
  
 
<hr>
 
<hr>

Latest revision as of 10:29, 24 May 2016

Model-based Analytics - Structural Equation Modeling (SEM)

SEM allows re-parameterization of random effects to specify latent variables that may affect measures at different time points using structural equations. SEM shows variables having predictive (possibly causal) effects on other variables (denoted by arrows), where the coefficients index the strength and direction of the predictive relations. SEM does not offer much more than classical regression methods do, but it does allow simultaneous estimation of multiple equations modeling complementary relations.

SEM is a general multivariate statistical analysis technique that can be used for causal modeling/inference, path analysis, confirmatory factor analysis (CFA), covariance structure modeling, and correlation structure modeling.

SEM Advantages

  • Allows testing models with multiple dependent variables
  • Provides mechanisms for modeling mediating variables
  • Enables modeling of error terms
  • Facilitates modeling of challenging data (longitudinal with auto-correlated errors, multi-level data, non-normal data, incomplete data)
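
As a rough illustration of the mediating-variable point above, the following minimal sketch simulates a chain X → M → Y and estimates the two path coefficients with two separate least-squares fits. This is not an SEM fit (an SEM package would estimate both equations simultaneously); the function names, coefficient values, and sample size are all assumptions made for the example.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on a single predictor x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_mediation(n=20000, a=0.5, b=0.8, seed=0):
    """Simulate X -> M -> Y with no direct X -> Y path (hypothetical values)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    m = [a * xi + rng.gauss(0, 1) for xi in x]   # mediator equation
    y = [b * mi + rng.gauss(0, 1) for mi in m]   # outcome equation
    return x, m, y


x, m, y = simulate_mediation()
a_hat = ols_slope(x, m)        # path X -> M
# Regressing Y on M alone is adequate here only because the simulated
# model has no direct X -> Y path.
b_hat = ols_slope(m, y)        # path M -> Y
indirect = a_hat * b_hat       # indirect (mediated) effect of X on Y
total = ols_slope(x, y)        # total effect of X on Y

print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
print(f"indirect = {indirect:.3f}, total = {total:.3f}")
```

Under full mediation the indirect effect (the product of the two path coefficients) and the total effect should roughly agree, which is the kind of relation an SEM expresses in a single model.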

SEM allows separation of observed and latent variables. Other standard statistical procedures may be viewed as special cases of SEM. In SEM, statistical significance is less important than in other techniques, and covariances are the core of structural equation models.
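
To make the role of covariances concrete, one common formulation (given here as a sketch, not necessarily the estimator used in the hands-on examples) chooses the parameters θ so that the model-implied covariance matrix Σ(θ) is as close as possible to the sample covariance matrix S of the p observed variables. The symbols S, Σ(θ), θ, and p are notation assumed for this illustration. The maximum-likelihood discrepancy function is:

```latex
F_{ML}(\theta) \;=\; \ln\lvert \Sigma(\theta) \rvert
  \;+\; \operatorname{tr}\!\left( S\, \Sigma(\theta)^{-1} \right)
  \;-\; \ln\lvert S \rvert \;-\; p
```

F_ML is zero when Σ(θ) = S and grows as the implied and observed covariances diverge.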

Definitions

  • The disturbance, D, is the variance in Y unexplained by a variable X that is assumed to affect Y.
X    →    Y   ←     D
  • Measurement error, E, is the variance in X unexplained by A, where X is an observed variable that is presumed to measure a latent variable, A.
A    →    X   ←     E
  • Variables in a model are categorized as exogenous (independent) or endogenous (dependent).
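
The two definitions above can be written as simple linear equations. This is a sketch under the usual assumptions that the relations are linear and that each residual term is uncorrelated with its predictor; the coefficient symbols β and λ are introduced here for illustration only.

```latex
% Disturbance: the part of Y not explained by X
Y = \beta X + D, \qquad
\operatorname{Var}(Y) = \beta^{2}\operatorname{Var}(X) + \operatorname{Var}(D),
\quad \operatorname{Cov}(X, D) = 0

% Measurement error: the part of the indicator X not explained by the latent A
X = \lambda A + E, \qquad
\operatorname{Var}(X) = \lambda^{2}\operatorname{Var}(A) + \operatorname{Var}(E),
\quad \operatorname{Cov}(A, E) = 0
```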

Notation

  • In SEM, observed (or manifest) indicators are represented by squares/rectangles, whereas latent variables (or factors) are represented by circles/ovals.
SMHS BigDataBigSci1.png
  • Relations: Direct effects (→), Reciprocal effects (↔ or ⇆), and Correlation or covariance (⤻ or ⤺) all have different appearance in SEM models.

Model Components

The measurement part of an SEM model deals with the latent variables and their indicators. A pure measurement model is a confirmatory factor analysis (CFA) model with unmeasured covariance (bidirectional arrows) between each possible pair of latent variables. There are straight arrows from the latent variables to their respective indicators and straight arrows from the error and disturbance terms to their respective variables, but no direct effects (straight arrows) connecting the latent variables. The measurement model is evaluated using goodness-of-fit measures (Chi-square test, BIC, AIC, etc.). The measurement model is always validated first.
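
The following minimal sketch shows how a pure measurement (CFA) model implies a covariance matrix for the indicators, using the standard relation Σ = Λ Φ Λᵀ + Θ (loadings Λ, latent covariance matrix Φ, diagonal error variances Θ). The two-factor layout and all numeric values are hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose of a nested-list matrix."""
    return [list(row) for row in zip(*a)]


def implied_covariance(lam, phi, theta_diag):
    """Model-implied covariance: Lambda * Phi * Lambda^T + diag(Theta)."""
    sigma = matmul(matmul(lam, phi), transpose(lam))
    for i, error_var in enumerate(theta_diag):
        sigma[i][i] += error_var   # error variances enter only on the diagonal
    return sigma


# Hypothetical model: two latent variables, each measured by two indicators.
lam = [[1.0, 0.0],    # indicator 1 loads on factor 1
       [0.8, 0.0],    # indicator 2 loads on factor 1
       [0.0, 1.0],    # indicator 3 loads on factor 2
       [0.0, 0.7]]    # indicator 4 loads on factor 2
phi = [[1.0, 0.3],    # latent variances on the diagonal,
       [0.3, 2.0]]    # latent covariance (bidirectional arrow) off-diagonal
theta = [0.5, 0.4, 0.6, 0.3]   # measurement-error variances

sigma = implied_covariance(lam, phi, theta)
for row in sigma:
    print([round(value, 3) for value in row])
```

Indicators of different factors covary only through the latent covariance in Φ, which is why the pure measurement model needs a bidirectional arrow between each pair of latent variables.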

Then we proceed to the structural model, which includes a set of exogenous and endogenous variables, the direct effects (straight arrows) connecting them, and the disturbance and error terms for these variables, which reflect the effects of unmeasured variables not in the model.
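
One widely used way to write the two parts is the LISREL-style notation below. It is offered as a sketch; the symbols (ξ for exogenous latent variables, η for endogenous latent variables, x and y for their indicators) are assumptions of this notation rather than something defined in this section.

```latex
% Measurement part: indicators as functions of latent variables plus error
x = \Lambda_x \, \xi + \delta, \qquad
y = \Lambda_y \, \eta + \varepsilon

% Structural part: direct effects among latent variables plus disturbance
\eta = B \, \eta + \Gamma \, \xi + \zeta
```

Here Λ_x and Λ_y hold the loadings (arrows from latent variables to indicators), B and Γ hold the direct effects among latent variables, δ and ε are measurement errors, and ζ is the disturbance.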

Notes

  • Sample-size considerations: mostly the same as for regression; more is always better.
  • Model assessment strategies: Chi-square test, Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA), Tucker Lewis Index (TLI), Goodness of Fit Index (GFI), AIC, and BIC.
  • Choice of the number of indicator variables: depends on pilot data analyses and a priori concerns; fewer is better.
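
Several of the fit measures listed above can be computed directly from the chi-square statistics of the fitted model and of a baseline (independence) model. The sketch below uses textbook formulas; software packages differ in details (for example, whether RMSEA divides by N or N − 1), and the input numbers are illustrative values, not results from the hands-on examples. GFI is omitted because it requires the covariance matrices themselves.

```python
import math


def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation (N - 1 convention)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative Fit Index relative to a baseline model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    return 1.0 if denom == 0 else 1.0 - d_model / denom


def tli(chi2_model, df_model, chi2_base, df_base):
    """Tucker Lewis Index (can fall outside the [0, 1] range)."""
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    return (ratio_base - ratio_model) / (ratio_base - 1.0)


def aic(loglik, k):
    """Akaike Information Criterion for k free parameters."""
    return -2.0 * loglik + 2.0 * k


def bic(loglik, k, n):
    """Bayesian Information Criterion for k free parameters and n cases."""
    return -2.0 * loglik + k * math.log(n)


# Illustrative (made-up) inputs: model and baseline chi-square, sample size.
chi2_m, df_m = 85.3, 24
chi2_b, df_b = 918.9, 36
n_obs = 301

print(f"RMSEA = {rmsea(chi2_m, df_m, n_obs):.3f}")
print(f"CFI   = {cfi(chi2_m, df_m, chi2_b, df_b):.3f}")
print(f"TLI   = {tli(chi2_m, df_m, chi2_b, df_b):.3f}")
```

Lower RMSEA, AIC, and BIC indicate better fit, while CFI and TLI closer to 1 indicate better fit; the exact cutoffs used in practice are conventions and vary by source.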

Hands-on Example 1 (School Kids Mental Abilities)

Hands-on Example 2 (Parkinson’s Disease data)

See also

  • Back to Model-based Analytics
  • Differences and Similarities between sem() and cfa()
  • Next Section: Growth Curve Modeling
  • Next Section: Generalized Estimating Equation (GEE) Modeling



