SMHS ANOVA
Scientific Methods for Health Sciences - Analysis of Variance (ANOVA)
Overview
Analysis of Variance (ANOVA) is a common method for analyzing differences between group means. In ANOVA, we divide the observed variance into components attributed to different sources of variation. It is a widely used statistical technique that tests whether or not the means of several groups are equal. ANOVA can therefore be thought of as a generalization of the t-test to more than 2 groups (with exactly 2 groups, the ANOVA results coincide with those of the 2-sample independent t-test that assumes equal variances). Here we introduce the ANOVA method, specifically one-way ANOVA and two-way ANOVA, with examples.
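The two-group case can be checked numerically. The following is a minimal sketch, not part of the original material: the two samples `g1` and `g2` are made-up numbers, and the helper names `pooled_t` and `one_way_f` are hypothetical. It assumes the pooled-variance (equal-variance) form of the 2-sample t-test, for which the one-way ANOVA F statistic equals the square of the t statistic.

```python
from statistics import mean

# Hypothetical measurements for two groups (illustration only, not from the text).
g1 = [5.1, 4.9, 6.2, 5.8]
g2 = [6.9, 7.4, 6.1, 7.0, 6.6]


def pooled_t(x, y):
    """Two-sample t statistic using the pooled variance estimate."""
    nx, ny = len(x), len(y)
    mx, my = mean(x), mean(y)
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    sp2 = (ssx + ssy) / (nx + ny - 2)  # pooled variance
    return (mx - my) / (sp2 * (1 / nx + 1 / ny)) ** 0.5


def one_way_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand = sum(sum(g) for g in groups) / n
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ssw = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ssb / (k - 1)) / (ssw / (n - k))


t = pooled_t(g1, g2)
f = one_way_f([g1, g2])
print(f"t^2 = {t ** 2:.6f}, F = {f:.6f}")
```

The two printed values agree up to floating-point rounding, which is the sense in which ANOVA generalizes the t-test.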
Motivation
In the previous section on two-sample inference, we applied the independent t-test to compare two independent group means. What if we want to compare k (k>2) independent samples? In this case, we need to decompose the total variation in the dataset into components that can be analyzed separately. Suppose 5 varieties of a product are tested for further study. A field was divided into 20 plots, with each variety planted in four plots. The measurements are shown in the table below:
A | B | C | D | E |
26.2 | 29.2 | 29.1 | 21.3 | 20.1 |
24.3 | 28.1 | 30.8 | 22.4 | 19.3 |
21.8 | 27.3 | 33.9 | 24.3 | 19.9 |
28.1 | 31.2 | 32.8 | 21.8 | 22.1 |
A | 26.2,24.3,21.8,28.1 |
B | 29.2,28.1,27.3,31.2 |
C | 29.1,30.8,33.9,32.8 |
D | 21.3,22.4,24.3,21.8 |
E | 20.1,19.3,19.9,22.1 |
Using ANOVA, the data are regarded as random samples from k populations (here k=5). Denote the population means by $\mu_{1},\mu_{2},\mu_{3},\mu_{4},\mu_{5}$ and the population standard deviations by $\sigma_{1},\sigma_{2},\sigma_{3},\sigma_{4},\sigma_{5}$. An obvious approach is to run $\binom{5}{2}=10$ separate t-tests, one for each pair of groups. However, every additional test adds another chance of a false positive, so the overall Type I error rate grows with the number of comparisons. ANOVA is simpler and more powerful here, because it tests all the group means with a single procedure.
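To see why 10 separate t-tests are unattractive, the sketch below enumerates the pairs and gives a rough estimate of the chance of at least one false positive. The significance level `alpha = 0.05` is an assumption, and the formula $1-(1-\alpha)^{m}$ treats the $m$ tests as independent, which pairwise tests on shared groups are not, so the result is only an approximation.

```python
from itertools import combinations
from math import comb

varieties = ["A", "B", "C", "D", "E"]

# Every unordered pair of varieties corresponds to one two-sample t-test.
pairs = list(combinations(varieties, 2))
n_tests = len(pairs)

alpha = 0.05  # assumed per-test significance level (not stated in the text)

# Approximate probability of at least one false positive across all tests,
# under the simplifying assumption that the tests are independent.
familywise = 1 - (1 - alpha) ** n_tests

print(pairs)
print(f"number of pairwise tests: {n_tests}")
print(f"approximate family-wise error rate: {familywise:.3f}")
```

Under these assumptions the overall error rate is roughly 40%, far above the nominal 5% of a single test.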
Theory
One-way ANOVA: we expand our inference methods to study and compare k independent samples. In this case, we will be decomposing the entire variation in the data into independent components.
- Notations first: $y_{ij}$ is the measurement from group $i$, observation index $j$; $k$ is the number of groups; $n_{i}$ is the number of observations in group $i$; $n$ is the total number of observations, $n=n_{1}+n_{2}+\cdots+n_{k}$. The group mean for group $i$ is $\bar y_{i.}=\frac{\sum_{j=1}^{n_{i}} y_{ij}}{n_{i}}$, and the grand mean is $\bar y=\bar y_{..}=\frac{\sum_{i=1}^{k}\sum_{j=1}^{n_{i}}y_{ij}}{n}$.
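The notation can be made concrete with the five-variety data from the Motivation section. The sketch below computes $k$, $n_{i}$, $n$, the group means and the grand mean. The variable names are illustrative. The last part is an assumption beyond the notation itself: it uses the standard one-way split of the total sum of squares into between-group and within-group parts, which has not been written out at this point, and only checks that the two parts add up to the total.

```python
# Measurements from the table in the Motivation section, keyed by variety.
data = {
    "A": [26.2, 24.3, 21.8, 28.1],
    "B": [29.2, 28.1, 27.3, 31.2],
    "C": [29.1, 30.8, 33.9, 32.8],
    "D": [21.3, 22.4, 24.3, 21.8],
    "E": [20.1, 19.3, 19.9, 22.1],
}

k = len(data)                                   # number of groups
n_i = {g: len(ys) for g, ys in data.items()}    # observations per group
n = sum(n_i.values())                           # total number of observations

# Group means and grand mean, following the definitions in the notation.
group_mean = {g: sum(ys) / n_i[g] for g, ys in data.items()}
grand_mean = sum(sum(ys) for ys in data.values()) / n

# Assumed standard decomposition: total = between-group + within-group.
sst = sum((y - grand_mean) ** 2 for ys in data.values() for y in ys)
ssb = sum(n_i[g] * (group_mean[g] - grand_mean) ** 2 for g in data)
ssw = sum((y - group_mean[g]) ** 2 for g, ys in data.items() for y in ys)

print(f"k = {k}, n = {n}")
for g in data:
    print(f"group {g}: n_i = {n_i[g]}, mean = {group_mean[g]:.2f}")
print(f"grand mean = {grand_mean:.2f}")
print(f"SST = {sst:.3f}, SSB + SSW = {ssb + ssw:.3f}")
```

Because every group here has the same size, the grand mean is also the plain average of the five group means. With unequal $n_{i}$ it would be their weighted average.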
- SOCR Home page: http://www.socr.umich.edu