Revision as of 13:31, 10 October 2014
Scientific Methods for Health Sciences - Repeated measures Analysis of Variance (rANOVA)
Overview
The phrase repeated measures describes situations in which the same objects/units/entities take part in all conditions of an experiment. Because there are multiple measurements on each subject, we must account for the correlation between those measurements. Repeated measures ANOVA (rANOVA) is a commonly used statistical approach for analyzing repeated measures designs. It is the equivalent of the one-way ANOVA, but for related, rather than independent, groups; it is also referred to as within-subjects ANOVA or ANOVA for correlated samples. The test detects overall differences between related means. Repeated measures ANOVA applies to two types of study design: studies that investigate either (1) changes in mean scores over three or more time points, or (2) differences in mean scores under three or more conditions.
Motivation
If you want to test the equality of means, ANOVA is often a good way to go. However, when it comes to data with repeated measures, a standard ANOVA is inappropriate because it fails to model the correlation between the repeated measures: the data violate the ANOVA assumption of independence. Repeated measures designs are used for several reasons. First, some research hypotheses require repeated measures; longitudinal research, for example, measures each sample member at each of several ages, in which case age is a repeated factor. Second, in cases where there is a great deal of variation between sample members, error variance estimates from standard ANOVAs are large; repeated measures of each sample member provide a way of accounting for this variance, thus reducing error variance. Third, when sample members are difficult to recruit, repeated measures designs are economical because each member is measured under all conditions. Fourth, repeated measures apply when members have been matched according to some important characteristic.
In order to provide a demonstration of how to calculate a repeated measures ANOVA, we shall use the example of a 6-month exercise-training intervention where six subjects had their fitness level measured on three occasions: pre-, 3 months, and post-intervention. Their data is shown below along with some initial calculations:
Subjects | Pre- | 3 Months | 6 Months | Subject Means
1 | 45 | 50 | 55 | 50
2 | 42 | 42 | 45 | 43
3 | 36 | 41 | 43 | 40
4 | 39 | 35 | 40 | 38
5 | 51 | 55 | 59 | 55
6 | 44 | 49 | 56 | 49.7
Monthly Means | 42.8 | 45.3 | 49.7 |
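As a quick check of these initial calculations, the condition means, subject means, and grand mean can be computed in plain Python (a minimal sketch; the variable names are our own, and the data are transcribed from the table above):

```python
# Fitness scores: one row per subject, columns = (pre, 3 months, 6 months)
scores = [
    [45, 50, 55],
    [42, 42, 45],
    [36, 41, 43],
    [39, 35, 40],
    [51, 55, 59],
    [44, 49, 56],
]
n = len(scores)      # number of subjects (6)
k = len(scores[0])   # number of conditions / time points (3)

# Column (condition) means, row (subject) means, and the overall grand mean
condition_means = [sum(row[j] for row in scores) / n for j in range(k)]
subject_means = [sum(row) / k for row in scores]
grand_mean = sum(sum(row) for row in scores) / (n * k)

print([round(m, 1) for m in condition_means])  # [42.8, 45.3, 49.7]
print([round(m, 1) for m in subject_means])    # [50.0, 43.0, 40.0, 38.0, 55.0, 49.7]
print(round(grand_mean, 1))                    # 45.9
```

Note that the table reports these means rounded to one decimal; the exact (unrounded) values are carried through the calculations below.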
Source | SS | df | MS | F
Conditions | $SS_{conditions}$ | $(k-1)$ | $MS_{conditions}$ | $\frac{MS_{conditions}}{MS_{error}}$
Subjects | $SS_{subjects}$ | $(n-1)$ | $MS_{subjects}$ | $\frac{MS_{subjects}}{MS_{error}}$
Error | $SS_{error}$ | $(k-1)(n-1)$ | $MS_{error}$ |
Total | $SS_{T}$ | $(N-1)$ | |
$F=\frac {MS_{conditions}}{MS_{error}}=\frac {MS_{time}}{MS_{error}}$
$SS_{Total} = SS_{conditions}+SS_{subjects}+SS_{Error}$
with corresponding degrees of freedom:
$df_{Total}=df_{conditions}+df_{subjects}+df_{Error}=(k-1)+(n-1)+(k-1)(n-1)$
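For this example ($n=6$ subjects, $k=3$ conditions) the degrees-of-freedom bookkeeping can be verified directly (a minimal Python sketch):

```python
n, k = 6, 3  # subjects, conditions

df_conditions = k - 1            # 2
df_subjects = n - 1              # 5
df_error = (k - 1) * (n - 1)     # 10
df_total = n * k - 1             # 17 (N - 1, with N = n*k observations)

# The partition must account for every degree of freedom
assert df_total == df_conditions + df_subjects + df_error
print(df_conditions, df_subjects, df_error, df_total)  # 2 5 10 17
```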
$SS_{Time}$ is the same as $SS_{b}$ in an independent ANOVA:

$SS_{Time}=SS_{b}=\sum_{i=1}^{k} n_{i} (\bar x_{i}-\bar x)^{2},$

where $k$ is the number of conditions, $n_{i}$ is the number of subjects under the $i$th condition, $\bar x_{i}$ is the mean score under the $i$th condition, and $\bar x$ is the overall grand mean of all conditions.
Here, we have
$SS_{Time}=SS_{b}=\sum_{i=1}^{k} n_{i} (\bar x_{i}-\bar x)^{2}=6[(42.83-45.94)^{2}+(45.33-45.94)^{2}+(49.67-45.94)^{2}]=6[9.679+0.373+13.855]=143.44,$ using the unrounded condition and grand means.
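The same value can be reproduced numerically; note that plugging in the means rounded to one decimal would give about 146.46 rather than 143.44, so unrounded means are used throughout (a minimal Python sketch):

```python
scores = [
    [45, 50, 55], [42, 42, 45], [36, 41, 43],
    [39, 35, 40], [51, 55, 59], [44, 49, 56],
]
n, k = len(scores), len(scores[0])

condition_means = [sum(row[j] for row in scores) / n for j in range(k)]
grand_mean = sum(sum(row) for row in scores) / (n * k)

# SS_Time: n subjects per condition times the squared deviation of each
# (unrounded) condition mean from the grand mean
ss_time = sum(n * (m - grand_mean) ** 2 for m in condition_means)
print(round(ss_time, 2))  # 143.44
```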
The within-subject variation $SS_{w}$ is calculated as
$SS_{W}=\sum_{1}(x_{i1}-\bar x_{1})^{2}+\sum_{2}(x_{i2}-\bar x_{2})^{2}+\cdots+\sum_{k}(x_{ik}-\bar x_{k})^{2}$
where $x_{ik}$ is the score of the $i$th subject under condition $k$, and $\bar x_{k}$ is the mean of condition $k$.
For this example we have: $SS_{W}=\sum_{1}(x_{i1}-\bar x_{1})^{2}+\sum_{2}(x_{i2}-\bar x_{2})^{2}+\cdots+\sum_{k}(x_{ik}-\bar x_{k})^{2}=[(45-42.8)^2+(42-42.8)^2+\cdots+(56-49.7)^2]=715.5$
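The full sum over all 18 squared within-condition deviations can be reproduced as follows (a minimal Python sketch):

```python
scores = [
    [45, 50, 55], [42, 42, 45], [36, 41, 43],
    [39, 35, 40], [51, 55, 59], [44, 49, 56],
]
n, k = len(scores), len(scores[0])

# SS_W: within each condition (column), sum the squared deviations of every
# subject's score from that condition's mean, then add across conditions
ss_w = 0.0
for j in range(k):
    col = [row[j] for row in scores]
    mean_j = sum(col) / n
    ss_w += sum((x - mean_j) ** 2 for x in col)
print(round(ss_w, 1))  # 715.5
```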
$SS_{subjects}$ is calculated by $SS_{subjects}=k\sum(\bar x_{i}-\bar x)^{2},$ where $\bar x_{i}$ here is the mean score of the $i$th subject (averaged across conditions) and $\bar x$ is the grand mean.
In our example, we have $SS_{subjects}=k\sum(\bar x_{i}-\bar x)^{2}=3[(50-45.9)^2+(43-45.9)^2+(40-45.9)^2+(38-45.9)^2+(55-45.9)^2+(49.7-45.9)^2]=658.3,$ again carrying the unrounded means.
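Computed with the unrounded subject means and grand mean, this gives (a minimal Python sketch):

```python
scores = [
    [45, 50, 55], [42, 42, 45], [36, 41, 43],
    [39, 35, 40], [51, 55, 59], [44, 49, 56],
]
n, k = len(scores), len(scores[0])

subject_means = [sum(row) / k for row in scores]
grand_mean = sum(sum(row) for row in scores) / (n * k)

# SS_subjects: k measurements per subject times the squared deviation of
# each subject's mean from the grand mean
ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
print(round(ss_subjects, 1))  # 658.3
```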
Thus,
$SS_{Error}=SS_{W}-SS_{subjects}=715.5-658.3=57.2$
$F=\frac{MS_{Time}}{MS_{Error}}=\frac{SS_{Time}/df_{Time}}{SS_{Error}/df_{Error}}=\frac{SS_{Time}/(k-1)}{SS_{Error}/((n-1)(k-1))}=\frac{143.44/(3-1)}{57.2/(5\times 2)}=\frac{71.72}{5.72}\approx 12.53$
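Putting the pieces together, the whole computation — sums of squares, mean squares, and the F-statistic — can be run end-to-end (a minimal Python sketch; the p-value can then be read from any F-distribution calculator, such as those linked below):

```python
scores = [
    [45, 50, 55], [42, 42, 45], [36, 41, 43],
    [39, 35, 40], [51, 55, 59], [44, 49, 56],
]
n, k = len(scores), len(scores[0])

condition_means = [sum(row[j] for row in scores) / n for j in range(k)]
subject_means = [sum(row) / k for row in scores]
grand_mean = sum(sum(row) for row in scores) / (n * k)

# Sums of squares, carrying unrounded intermediate values
ss_time = sum(n * (m - grand_mean) ** 2 for m in condition_means)
ss_w = sum((row[j] - condition_means[j]) ** 2
           for row in scores for j in range(k))
ss_subjects = k * sum((m - grand_mean) ** 2 for m in subject_means)
ss_error = ss_w - ss_subjects

# Mean squares and the F-statistic
ms_time = ss_time / (k - 1)                # ~71.72
ms_error = ss_error / ((k - 1) * (n - 1))  # ~5.72
f_stat = ms_time / ms_error
print(round(f_stat, 2))  # 12.53
```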
We can now look up (or compute with software) the critical F-statistic for the F-distribution with our degrees of freedom for time, $df_{Time}$, and error, $df_{Error}$, and determine whether our F-statistic indicates a statistically significant result.
We report the F-statistic from a repeated measures ANOVA as:
$F(df_{Time}, df_{Error}) = F\text{-value},\ p = p\text{-value},$
which for our example would be:
$F(2, 10) = 12.53, p = 0.002;$ see the SOCR Java F-distribution calculator (http://socr.ucla.edu/htmls/dist/Fisher_Distribution.html) or the Distributome HTML5 F-distribution calculator (www.distributome.org/V3/calc/FCalculator.html).
- SOCR Home page: http://www.socr.umich.edu