Scientific Methods for Health Sciences - Fixed, Randomized and Mixed Effect Models
Overview:
A fixed effects model is a statistical model that represents the observed quantities in terms of explanatory variables treated as if they were non-random, while a random effects model assumes that the dataset being analyzed consists of a hierarchy of different populations whose differences relate to that hierarchy, and a mixed model contains both fixed effects and random effects. In random effects and mixed models, all or part of the explanatory variables are treated as if they arise from random causes. In this section, we present the theory and practice of these three types of models.
Motivation:
We should pay attention to the types of variables we include in a model in order to choose the right model for our data. When all the explanatory variables are treated as non-random, a fixed effects model is the natural choice; similarly, a random effects model is appropriate when the explanatory variables are considered to arise from random causes, and a mixed model lets us handle a mixture of random and fixed effects in the same study. The questions, then, are how these models work, how they differ from each other, and how we implement them in a study.
Theory
1) Fixed effects model: a statistical model that represents the observed quantities in terms of explanatory variables treated as if they were non-random. The term fixed effects estimator refers to an estimator of the coefficients in the regression model; when we assume fixed effects, we impose time-independent effects for each entity.
- The fixed effect model can be used to control for unobserved heterogeneity when this heterogeneity is constant over time and correlated with independent variables. The constant can be removed by differencing.
- The fixed effect assumption: the individual-specific effect is correlated with the independent variables. This is the primary criterion for using the fixed effect model.
- The fixed effect model: consider the linear unobserved effects model for $N$ observations and $T$ time periods: $y_{it}=X_{it}\beta + \alpha_{i} + u_{it}$, for $t = 1, 2, \cdots, T$ and $i = 1, 2, \cdots, N$, where $y_{it}$ is the dependent variable observed for individual $i$ at time $t$, $X_{it}$ is the time-variant $1\times k$ regressor matrix, $\alpha_{i}$ is the unobserved time-invariant individual effect, and $u_{it}$ is the error term. $\alpha_{i}$ cannot be observed by the econometrician; common examples of time-invariant effects $\alpha_{i}$ are innate ability for individuals or institutional factors for countries. The model allows $\alpha_{i}$ to be correlated with the regressor matrix $X_{it}$.
- $\alpha_{i}$ is not observable and cannot be directly controlled for. The fixed effect model eliminates $\alpha_{i}$ by demeaning the variables using the within transformation: $y_{it} - \bar{y}_{i} = (X_{it} - \bar{X}_{i})\beta + (\alpha_{i} - \bar{\alpha}_{i}) + (u_{it} - \bar{u}_{i}) \Rightarrow \ddot{y}_{it}=\ddot{X}_{it}\beta+\ddot{u}_{it}$, where $\bar{X}_{i}=\frac{1}{T}\sum_{t=1}^{T}X_{it}$ and $\bar{u}_{i}=\frac{1}{T} \sum_{t=1}^{T}u_{it}$. Since $\alpha_{i}$ is constant, $\bar{\alpha}_{i}=\alpha_{i}$, and the effect is eliminated. The fixed effect estimator $\hat{\beta}_{FE}$ is then obtained by an OLS regression of $\ddot{y}$ on $\ddot{X}$ (a demeaning-and-OLS sketch in R appears after the list of steps below).
- Equality of the Fixed Effects (FE) and First Differences (FD) estimators in the special case $T=2$; in effect, the FE estimator doubles the data set used in the FD estimator: $FE_{T=2}=[\sum_{i=1}^{N}(x_{i1}-\bar{x}_{i})(x_{i1}-\bar{x}_{i})'+(x_{i2}-\bar{x}_{i})(x_{i2}-\bar{x}_{i})']^{-1}[\sum_{i=1}^{N}(x_{i1}-\bar{x}_{i})(y_{i1}-\bar{y}_{i})+(x_{i2}-\bar{x}_{i})(y_{i2}-\bar{y}_{i})].$ Since each $(x_{i1}-\bar{x}_{i})$ can be rewritten as $\frac{x_{i1}-x_{i2}}{2}$ (and $(x_{i2}-\bar{x}_{i})$ as $\frac{x_{i2}-x_{i1}}{2}$),
$FE_{T=2}=[\sum_{i=1}^{N}\frac{x_{i1}-x_{i2}}{2}\frac{(x_{i1}-x_{i2})'}{2}+\frac{x_{i2}-x_{i1}}{2}\frac{(x_{i2}-x_{i1})'}{2}]^{-1}[\sum_{i=1}^{N}\frac{x_{i1}-x_{i2}}{2}\frac{y_{i1}-y_{i2}}{2}+\frac{x_{i2}-x_{i1}}{2}\frac{y_{i2}-y_{i1}}{2}]$
$=[\sum_{i=1}^{N}2\frac{x_{i2}-x_{i1}}{2}\frac{(x_{i2}-x_{i1})'}{2}]^{-1}[\sum_{i=1}^{N}2\frac{x_{i2}-x_{i1}}{2}\frac{y_{i2}-y_{i1}}{2}]$
$=[\sum_{i=1}^{N}(x_{i2}-x_{i1})(x_{i2}-x_{i1})']^{-1}\sum_{i=1}^{N}(x_{i2}-x_{i1})(y_{i2}-y_{i1})=FD_{T=2}.$
- Hausman-Taylor method: requires more than one time-variant regressor ($X$) and time-invariant regressor ($Z$), and at least one $X$ and one $Z$ that are uncorrelated with $\alpha_{i}$. Partition the $X$ and $Z$ variables such that $X=[X_{1it}\vdots X_{2it}]$, $Z=[Z_{1it}\vdots Z_{2it}]$, where $X_{1}$ and $Z_{1}$ are uncorrelated with $\alpha_{i}$; we need $K_{1} > G_{2}$. Estimating $\gamma$ via OLS on $\hat{d_{i}} = Z_{i}\gamma + \varphi_{it}$, using $X_{1}$ and $Z_{1}$ as instruments, yields a consistent estimate.
- Testing FE vs. RE with a Hausman test: $H_{0}: \alpha_{i}$ is independent of $X_{it}$, $Z_{it}$ vs. $H_{a}: \alpha_{i}$ is not independent of $X_{it}$, $Z_{it}$. If $H_{0}$ is true, both $\hat{\beta}_{RE}$ and $\hat{\beta}_{FE}$ are consistent, but only $\hat{\beta}_{RE}$ is efficient. If $H_{a}$ is true, $\hat{\beta}_{FE}$ is consistent and $\hat{\beta}_{RE}$ is not. With $\hat{Q} = \hat{\beta}_{RE}-\hat{\beta}_{FE}$, the test statistic is $\hat{HT}= T\hat{Q}'[Var(\hat{\beta}_{FE})-Var(\hat{\beta}_{RE})]^{-1}\hat{Q} \sim \chi_{K}^2,$ where $K = \dim(Q)$. The Hausman test is a specification test, so a large test statistic may indicate errors in variables or a misspecified model. If the fixed effect assumption is true, $\hat{\beta}_{LD} \approx \hat{\beta}_{FD} \approx \hat{\beta}_{FE}$ (the test is illustrated in the panel-data sketch after the steps below).
- Steps for the fixed effects model (carried out in the R sketch immediately after these steps):
- Calculate group and grand means,
- Calculate $k$=number of groups, $n$=number of observations per group, $N$=total number of observations $(k * n)$,
- Calculate SS-total (the total variance) as $(\text{each score} - \text{grand mean})^2$, summed over all observations,
- Calculate SS-treat (the treatment effect) as $(\text{each group mean} - \text{grand mean})^2$, summed over groups and multiplied by $n$,
- Calculate SS-error (the error effect) as $(\text{each score} - \text{its group mean})^2$, summed over all observations,
- Calculate df-total: N-1, df-treat: k-1 and df-error k(n-1),
- Calculate Mean Square MS-treat: SS-treat/df-treat, then MS-error: SS-error/df-error,
- Calculate the obtained F value: MS-treat/MS-error,
- Use an F-table or probability function to look up the critical F value at the chosen significance level,
- Conclude as to whether treatment effect significantly affects the variable of interest.
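The steps above can be carried out directly in base R. The following is a minimal sketch on simulated data (the number of groups, the group sizes, and the group means are made up for illustration only), with the result checked against aov():

 set.seed(2)
 k <- 3; n <- 10                             # k groups, n observations per group
 group <- factor(rep(1:k, each = n))
 score <- rnorm(k * n, mean = c(10, 12, 15)[as.integer(group)], sd = 2)  # hypothetical scores
 grand.mean  <- mean(score)
 group.means <- tapply(score, group, mean)
 N <- k * n
 SS.total <- sum((score - grand.mean)^2)                        # total variance
 SS.treat <- n * sum((group.means - grand.mean)^2)              # treatment effect
 SS.error <- sum((score - group.means[as.integer(group)])^2)    # error effect
 df.treat <- k - 1; df.error <- k * (n - 1)
 MS.treat <- SS.treat / df.treat
 MS.error <- SS.error / df.error
 F.obtained <- MS.treat / MS.error
 p.value <- pf(F.obtained, df.treat, df.error, lower.tail = FALSE)
 c(F = F.obtained, p = p.value)
 summary(aov(score ~ group))                 # same F statistic and p-value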
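For the panel within-transformation estimator and the Hausman test described earlier, the sketch below uses simulated panel data and, as one possible tool, the plm package (assumed to be installed); all variable names and data-generating values are assumptions of this example, not part of the original text:

 library(plm)
 set.seed(1)
 N <- 200; nT <- 5
 id   <- rep(1:N, each = nT)
 time <- rep(1:nT, times = N)
 alpha <- rnorm(N)[id]                       # unobserved individual effect
 x <- 0.5 * alpha + rnorm(N * nT)            # regressor correlated with alpha
 y <- 1 + 2 * x + alpha + rnorm(N * nT)      # true slope beta = 2
 pdat <- pdata.frame(data.frame(id, time, x, y), index = c("id", "time"))
 fe <- plm(y ~ x, data = pdat, model = "within")   # fixed effects (within) estimator
 re <- plm(y ~ x, data = pdat, model = "random")   # random effects (GLS) estimator
 coef(fe); coef(re)
 phtest(fe, re)                              # Hausman test of FE vs. RE
 # the within transformation by hand: demean y and x by individual, then OLS
 y.dd <- y - ave(y, id)
 x.dd <- x - ave(x, id)
 coef(lm(y.dd ~ x.dd - 1))                   # matches the plm "within" estimate

Because the simulated individual effect is correlated with the regressor, the fixed effects estimate stays close to the true slope while the random effects estimate is pulled away from it, and the Hausman test tends to reject the random effects specification.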
2) Random effects model: a statistical model in which the dataset being analyzed consists of a hierarchy of different populations whose differences relate to that hierarchy. It stands in contrast to the fixed effects model, where the data consist only of non-random variables.
- The random effects model can likewise be applied to control for unobserved heterogeneity that is constant over time; in contrast to the fixed effects case, this heterogeneity is assumed to be uncorrelated with the independent variables.
- The random effect assumption: individual specific effects are uncorrelated with the independent variables. The random effects model is efficient and consistent only when this assumption is met.
- Suppose $m$ schools are chosen randomly from the United States to study the math scores of 7th-grade students. Suppose that $n$ students of the same age are randomly chosen from each selected school and their math scores are recorded. Let $Y_{ij}$ be the score of the $j^{th}$ student at the $i^{th}$ school. A simple model can be fitted: $Y_{ij} = \mu + U_{i} + W_{ij}$, where $\mu$ is the average test score for the entire population. In this model, $U_{i}$ is the school-specific random effect, which measures the difference between the average math score at school $i$ and the average math score in the entire country; it is considered random because the schools are randomly chosen from the country. $W_{ij}$ is the individual-specific error, which is also random since the students within each school are randomly chosen. We can augment the model further by adding additional explanatory variables, such as the teacher-to-student ratio.
- Variance components: the variance of $Y_{ij}$ is the sum of the variances $\tau^2$ and $\sigma^2$ of $U_{i}$ and $W_{ij}$, respectively. Let $\bar{Y}_{i\cdot}=\frac{1}{n} \sum_{j=1}^{n} Y_{ij}$ be the average score of the students sampled at the $i^{th}$ school and $\bar{Y}_{\cdot\cdot} = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}Y_{ij}$ be the grand average. Let $SSW = \sum_{i=1}^{m}\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_{i\cdot})^2$ and $SSB = n\sum_{i=1}^{m}(\bar{Y}_{i\cdot}-\bar{Y}_{\cdot\cdot})^2$ be the sum of squares due to differences within groups and the sum of squares due to differences between groups, respectively. Then $\frac{1}{m(n-1)}E(SSW)=\sigma^2$ and $\frac{1}{(m-1)n}E(SSB)=\frac{\sigma^2}{n}+\tau^2$, which can be used to estimate the variance components $\sigma^2$ and $\tau^2$ (see the R sketch after this list).
- Comments on the random effects model: in general the random effects model is efficient and should be used if its underlying assumptions are satisfied. For random effects to work in the school example, the school-specific effects should be uncorrelated with the other covariates of the model. A Hausman specification test can be used to check this assumption, as described for the fixed effects model above.
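The variance-component formulas above can be illustrated on simulated school data. The sketch below assumes hypothetical true values $\tau^2 = 4$ and $\sigma^2 = 9$ (these values, the sample sizes, and the variable names are assumptions of this example) and compares the moment estimates with a random-intercept fit from nlme::lme:

 library(nlme)
 set.seed(3)
 m <- 50; n <- 20                            # m schools, n students per school
 school <- factor(rep(1:m, each = n))
 U <- rnorm(m, sd = 2)[as.integer(school)]   # school-specific random effect, tau^2 = 4
 W <- rnorm(m * n, sd = 3)                   # individual-specific error, sigma^2 = 9
 Y <- 50 + U + W
 Ybar.i <- tapply(Y, school, mean)
 SSW <- sum((Y - Ybar.i[as.integer(school)])^2)
 SSB <- n * sum((Ybar.i - mean(Y))^2)
 sigma2.hat <- SSW / (m * (n - 1))                    # E(SSW)/(m(n-1)) = sigma^2
 tau2.hat   <- SSB / ((m - 1) * n) - sigma2.hat / n   # E(SSB)/((m-1)n) = sigma^2/n + tau^2
 c(sigma2 = sigma2.hat, tau2 = tau2.hat)
 fit <- lme(Y ~ 1, random = ~ 1 | school, data = data.frame(Y, school))
 VarCorr(fit)                                # REML estimates of the same components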
3) Mixed models: statistical models containing both fixed and random effects. They are widely used in a variety of disciplines and are particularly useful when repeated measurements are made on the same statistical units (longitudinal studies) or when measurements are made on clusters of related statistical units.
- Mixed model formula (in matrix form): $y = X\beta + Zu + \epsilon$, where $y$ is a vector of observations with mean $E(y) = X\beta$, $\beta$ is a vector of fixed effects, $u$ is a vector of random effects with mean $E(u)=0$ and variance-covariance matrix $var(u)=G$, $\epsilon$ is a vector of i.i.d. random error terms with mean $E(\epsilon)=0$ and variance $var(\epsilon)=R$, and $X$ and $Z$ are design matrices relating the observations $y$ to $\beta$ and $u$, respectively.
- Estimation via Henderson's mixed model equations (MME): $\begin{pmatrix} X'R^{-1}X & X'R^{-1}Z\\ Z'R^{-1}X & Z'R^{-1}Z+G^{-1} \end{pmatrix} \begin{pmatrix} \tilde{\beta}\\ \tilde{u} \end{pmatrix} = \begin{pmatrix} X'R^{-1}y\\ Z'R^{-1}y \end{pmatrix}.$ The solutions of the MME, $\tilde{\beta}$ and $\tilde{u}$, are the best linear unbiased estimates (BLUE) of $\beta$ and the best linear unbiased predictors (BLUP) of $u$, respectively. When the conditional variance is known, the inverse-variance weighted least squares estimate is BLUE; however, the conditional variance is rarely known, so it is desirable to estimate the variance components jointly with the weighted parameter estimates when solving the MME. One way to fit such mixed models is the EM algorithm, as implemented by lme in the nlme library in R (a small numeric sketch of the MME follows this list).
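As a small numeric illustration of Henderson's equations, the sketch below builds and solves the MME for a random-intercept model; treating $G$ and $R$ as known (here $G = 4I$ and $R = I$) is an assumption made only to keep the example short, as are the simulated data and variable names:

 set.seed(4)
 m <- 5; n.per <- 10; n <- m * n.per
 grp <- rep(1:m, each = n.per)
 x <- rnorm(n)
 u.true <- rnorm(m, sd = 2)
 y <- 1 + 3 * x + u.true[grp] + rnorm(n, sd = 1)
 X <- cbind(1, x)                            # fixed-effects design matrix
 Z <- model.matrix(~ factor(grp) - 1)        # random-effects design (one column per group)
 G <- diag(4, m); R <- diag(1, n)            # var(u) and var(e), taken as known here
 Ri <- solve(R)
 LHS <- rbind(cbind(t(X) %*% Ri %*% X, t(X) %*% Ri %*% Z),
              cbind(t(Z) %*% Ri %*% X, t(Z) %*% Ri %*% Z + solve(G)))
 RHS <- rbind(t(X) %*% Ri %*% y, t(Z) %*% Ri %*% y)
 sol <- solve(LHS, RHS)                      # solve the mixed model equations
 beta.tilde <- sol[1:2]                      # BLUE of the fixed effects
 u.tilde <- sol[-(1:2)]                      # BLUP of the random effects
 beta.tilde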
4) Comparing the fixed effects and random effects models:
- Estimating the summary effect: under the fixed effect model, it is assumed that the true effect size is the same in all studies and that the only reason the effect size varies between studies is sampling error. In that situation we weight the studies by the information they carry and can largely ignore the information in the smaller studies, since we have better information about the same effect size from the larger studies. In contrast, the random effects model aims to estimate the mean of a distribution of effects; each study estimates its own true effect, and the estimate provided by a small study may be imprecise, but it is information about an effect that no other study has estimated. We cannot discount a small study by giving it a very small weight, nor can we give too much weight to a large study. Hence, under the fixed effect model there is a wide range of weights, while the weights under the random effects model fall in a relatively narrow range: study weights are more balanced under the random effects model than under the fixed effect model (see the weighting sketch at the end of this comparison).
- A fixed effect meta analysis estimates a single effect that is assumed to be common to every study while a random effects meta analysis estimates the mean of a distribution of effects.
- Moving from fixed effect to random effects, extreme studies will lose influence if they are large and will gain influence if they are small.
- Confidence interval: for the fixed effect model, the only source of uncertainty is the within study error while the random effects model has an additional source of between studies variance. The confidence intervals for the summary effect are wider under the random effect model than under the fixed effect model.
- The hypothesis tested: with the fixed effect model, the null hypothesis tested is that there is zero effect in every study while that in the random effect model is that the mean effect is zero.
- The selection of the right model is based on our expectation about whether or not the studies share a common effect size and on our goals in performing the analysis. We use the fixed effect model if the following two conditions are satisfied: (1) all the studies included in the analysis are functionally identical; (2) the goal is to compute the common effect size for the identified population, not to generalize to other populations. By contrast, when we are accumulating data from a series of studies performed independently, it is unlikely that all the studies are functionally equivalent; they differ in ways that would impact the results, and we cannot assume a common effect size. The random effects model is more easily justified than the fixed effect model in cases like this. The selection of a model must be based on which model fits the distribution of effect sizes and takes account of the relevant sources of error.
- Under the fixed effect model we assume that all dispersion in the observed effects is due to sampling error, while under the random effects model we allow that some of that dispersion reflects real differences in effect size across studies.
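The weighting contrast described above can be sketched numerically. The effect sizes and within-study variances below are made up for illustration, and the between-study variance is estimated with the DerSimonian-Laird moment formula; the random-effects weights come out noticeably more balanced than the fixed-effect weights:

 y <- c(0.30, 0.45, 0.10, 0.60, 0.25)        # hypothetical study effect sizes
 v <- c(0.01, 0.20, 0.02, 0.25, 0.03)        # hypothetical within-study variances
 w.fe <- 1 / v                               # fixed-effect (inverse-variance) weights
 theta.fe <- sum(w.fe * y) / sum(w.fe)       # fixed-effect summary estimate
 Q <- sum(w.fe * (y - theta.fe)^2)           # heterogeneity statistic
 k <- length(y)
 tau2 <- max(0, (Q - (k - 1)) / (sum(w.fe) - sum(w.fe^2) / sum(w.fe)))  # DerSimonian-Laird
 w.re <- 1 / (v + tau2)                      # random-effects weights
 theta.re <- sum(w.re * y) / sum(w.re)       # random-effects summary estimate
 round(cbind(FE = w.fe / sum(w.fe), RE = w.re / sum(w.re)), 3)  # relative study weights
 c(summary.FE = theta.fe, summary.RE = theta.re,
   se.FE = sqrt(1 / sum(w.fe)), se.RE = sqrt(1 / sum(w.re)))    # wider CI under RE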
Applications
1) This article (http://www.tandfonline.com/doi/abs/10.1080/01621459.1984.10477102#.U-2PdhZTWdA) proposed approximations for standard errors of estimators of fixed and random effects in mixed linear models. Best linear unbiased estimators of the fixed and random effects of mixed linear models are available when the true values of the variance ratios are known. If the true values are replaced by estimated values, the mean squared errors of the estimators of the fixed and random effects increase in size. The magnitude of this increase is investigated, and a general approximation is proposed. The performance of this approximation is investigated in the context of (a) the estimation of the effects of the balanced one-way random model and (b) the estimation of treatment contrasts for balanced incomplete block designs.
2) This article (http://amstat.tandfonline.com/doi/abs/10.1080/01621459.1955.10501298#.U-2P1xZTWdA) discussed fixed, mixed and random models. Some explicit questions are raised regarding the adequacy of assumed linear models as a basis for the interpretation of the analysis of variance of randomized experiments. A generally applicable method for the derivation of a linear statistical model, based on the experimental situation and the design of the experiment, is exemplified. The central features of the method are the notion of “experimental unit,” the concept of “true response,” and the use of randomization in the design. A model is derived for the case where two factors having A and B levels respectively, are to be examined with respect to a population of P experimental units, where selection of levels of the factors to be tested, selection of experimental units to be used, and the allocation of selected treatment combinations to units is at random. First a linear population model is given, based on the structure of the experimental situation, whose components are (unknown) parameters of the population of (conceptual) “true” responses. Then the conditions of the design are imposed to obtain a linear statistical model whose components involve the parameters of the population model and some defined random variables reflecting the experimental procedure and design. The derived statistical model is then used to obtain expected mean squares in the analyses of variance. A second illustration of the general methodology is given for a more complex example originated by Vaurio and Daniel and discussed statistically by Scheffé. Some differences from Scheffé's results are noted.
Software
lme (linear mixed effects) function in the nlme library; see the R appendix at http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-mixed-models.pdf
RCODE:

 library(nlme)
 library(lattice)     ## for Trellis graphics
 data(MathAchieve)
 MathAchieve[1:10,]   # first 10 students; the data relate to 7185 students
 # SES: the socioeconomic status of the student's family, centered to an overall mean of 0
 # MathAch: the student's score on a math-achievement test
 # Sector: factor coded 'Catholic' or 'Public'
 # MEANSES: mean SES for the students in each school
 Grouped Data: MathAch ~ SES | School
    School Minority    Sex    SES MathAch MEANSES
 1    1224       No Female -1.528   5.876  -0.428
 2    1224       No Female -0.588  19.708  -0.428
 3    1224       No   Male -0.528  20.349  -0.428
 4    1224       No   Male -0.668   8.781  -0.428
 5    1224       No   Male -0.158  17.898  -0.428
 6    1224       No   Male  0.022   4.583  -0.428
 7    1224       No Female -0.618  -2.832  -0.428
 8    1224       No   Male -0.998   0.523  -0.428
 9    1224       No Female -0.888   1.527  -0.428
 10   1224       No   Male -0.458  21.521  -0.428
 data(MathAchSchool)
 MathAchSchool[1:10,]   # first 10 schools; the data relate to 160 schools
      School Size   Sector PRACAD DISCLIM HIMINTY MEANSES
 1224   1224  842   Public   0.35   1.597       0  -0.428
 1288   1288 1855   Public   0.27   0.174       0   0.128
 1296   1296 1719   Public   0.32  -0.137       1  -0.420
 1308   1308  716 Catholic   0.96  -0.622       0   0.534
 1317   1317  455 Catholic   0.95  -1.694       1   0.351
 1358   1358 1430   Public   0.25   1.535       0  -0.014
 1374   1374 2400   Public   0.50   2.016       0  -0.007
 1433   1433  899 Catholic   0.96  -0.321       0   0.718
 1436   1436  185 Catholic   1.00  -1.141       0   0.569
 1461   1461 1672   Public   0.78   2.096       0   0.683
 attach(MathAchieve)
 attach(MathAchSchool)
 mses <- tapply(SES, School, mean)   ## school means
 mses[as.character(MathAchSchool$School[1:10])]   # for first 10 schools
Check the R example in the attached file at http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-mixed-models.pdf .
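As a minimal sketch of how such a model could be specified with lme on the MathAchieve data loaded above (one possible starting model, not the full analysis in the linked appendix):

 library(nlme)
 data(MathAchieve)
 # random-intercept model: each school gets its own intercept
 fit <- lme(MathAch ~ SES, random = ~ 1 | School, data = MathAchieve)
 summary(fit)      # fixed effects (intercept, SES slope) and variance components
 intervals(fit)    # approximate confidence intervals
 # a school-specific SES slope could be added with random = ~ SES | School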
Problems
I like the above example – but can we use one of the SOCR Datasets (SOCR Data) to demonstrate the same linear mixed effect modeling in a different setting?
For example, can we replicate the same analysis protocol using either the Ozone Data (where location acts as the grouping variable, playing the role of school in the previous example), the PRB Data, or the World Countries Rankings datasets?