Difference between revisions of "AP Statistics Curriculum 2007 Infer 2Means Indep"
Revision as of 00:11, 10 February 2008
General Advanced-Placement (AP) Statistics Curriculum - Inferences about Two Means: Independent Samples
In the previous section we discussed the inference on two paired random samples. Now, we show how to do inference on two independent samples.
Independent Samples Designs
Independent samples designs refer to designs of experiments or observational studies in which all measurements within each group are independent of one another and the groups themselves are independent. The groups may be drawn from different populations with different distribution characteristics.
Background
- Recall that for a random sample {\(X_1, X_2, X_3, \cdots , X_n\)} of the process, the population mean may be estimated by the sample average, \(\overline{X_n}={1\over n}\sum_{i=1}^n{X_i}\).
- The standard error of \(\overline{x}\) is given by \({{1\over \sqrt{n}} \sqrt{\sum_{i=1}^n{(x_i-\overline{x})^2\over n-1}}}\).
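As a small illustration of the two background formulas, the sketch below computes a sample average and its standard error with the Python standard library. The data values and the function name `standard_error` are made up for this example and are not part of the original text.

```python
import math
import statistics

def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n).

    statistics.stdev uses the n-1 denominator, matching the formula above.
    """
    n = len(sample)
    s = statistics.stdev(sample)
    return s / math.sqrt(n)

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
sample = [2.0, 4.0, 4.0, 6.0]
x_bar = statistics.mean(sample)   # sample average
se = standard_error(sample)       # standard error of the average
print(f"mean = {x_bar:.4f}, SE = {se:.4f}")
```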
Analysis Protocol for Independent Designs
To study independent samples we would like to examine the differences between two group means. Suppose {\(X_1, X_2, X_3, \cdots , X_{n_1}\)} and {\(Y_1, Y_2, Y_3, \cdots , Y_{n_2}\)} represent the two independent samples, of sizes \(n_1\) and \(n_2\). Then we want to study the difference of the two group means relative to the internal sample variations. If the two samples were drawn from populations that had different centers, then we would expect the two sample averages to be distinct.
Large Samples
- Significance Testing: We have a standard null hypothesis \(H_o: \mu_X -\mu_Y = \mu_o\) (e.g., \(\mu_o=0\)). Then the test statistic is:
\[Z_o = {\overline{x}-\overline{y}-\mu_o \over SE(\overline{x}-\overline{y})} \sim N(0,1),\]
with observed value
\[z_o= {\overline{x}-\overline{y}-\mu_o \over \sqrt{{1\over {n_1}} {\sum_{i=1}^{n_1}{(x_i-\overline{x})^2\over n_1-1}} + {1\over {n_2}} {\sum_{i=1}^{n_2}{(y_i-\overline{y})^2\over n_2-1}}}}.\]
- Confidence Intervals: A \((1-\alpha)100\%\) confidence interval for \(\mu_X-\mu_Y\) is
\[CI(\alpha): \overline{x}-\overline{y} \pm z_{\alpha\over 2} SE(\overline{x}-\overline{y})= \overline{x}-\overline{y} \pm z_{\alpha\over 2} \sqrt{{1\over {n_1}} {\sum_{i=1}^{n_1}{(x_i-\overline{x})^2\over n_1-1}} + {1\over {n_2}} {\sum_{i=1}^{n_2}{(y_i-\overline{y})^2\over n_2-1}}}.\]
Note that \(SE(\overline{x} -\overline{y})=\sqrt{SE^2(\overline{x})+SE^2(\overline{y})}\), because the samples are independent and the variances of independent quantities add. Also, \(z_{\alpha\over 2}\) is the critical value for a Standard Normal distribution at \({\alpha\over 2}\).
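The large-sample procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `two_sample_z`, the default \(\alpha = 0.05\), the two-sided p-value, and the simulated data are all choices made for this example, not part of the original text. It uses `statistics.NormalDist` (Python 3.8+) for the Standard Normal distribution.

```python
import math
import random
from statistics import NormalDist, mean, variance

def two_sample_z(x, y, mu_0=0.0, alpha=0.05):
    """Large-sample z statistic, two-sided p-value, and (1-alpha) CI
    for the difference of two independent group means."""
    n1, n2 = len(x), len(y)
    # SE of the difference: square root of the sum of squared SEs.
    se = math.sqrt(variance(x) / n1 + variance(y) / n2)
    diff = mean(x) - mean(y)
    z = (diff - mu_0) / se
    std_normal = NormalDist()
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))   # two-sided (an assumption)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)   # critical value z_{alpha/2}
    ci = (diff - z_crit * se, diff + z_crit * se)
    return z, p_value, ci

# Simulated data, for illustration only.
rng = random.Random(0)
x = [rng.gauss(10.0, 2.0) for _ in range(60)]
y = [rng.gauss(9.0, 3.0) for _ in range(80)]

z_o, p_value, ci = two_sample_z(x, y)
print(f"z_o = {z_o:.3f}, p = {p_value:.4f}, CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

With \(\mu_o = 0\), the interval excludes zero exactly when the two-sided p-value falls below \(\alpha\), so the test and the interval give the same decision.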
Small Samples
- Significance Testing: Again, we have a standard null hypothesis \(H_o: \mu_X -\mu_Y = \mu_o\) (e.g., \(\mu_o=0\)). Then the test statistic is:
\[T_o = {\overline{x}-\overline{y}-\mu_o \over SE(\overline{x}-\overline{y})} \sim T(df),\]
with observed value
\[t_o= {\overline{x}-\overline{y}-\mu_o \over \sqrt{{1\over {n_1}} {\sum_{i=1}^{n_1}{(x_i-\overline{x})^2\over n_1-1}} + {1\over {n_2}} {\sum_{i=1}^{n_2}{(y_i-\overline{y})^2\over n_2-1}}}}.\]
- The degrees of freedom is
\[df={\left( SE^2(\overline{x})+SE^2(\overline{y}) \right)^2 \over {SE^4(\overline{x}) \over n_1-1} + {SE^4(\overline{y}) \over n_2-1} } \approx n_1+n_2-2.\]
The approximation is close when the two samples have similar sizes and variances. When a table requires an integer value, round the degrees of freedom down to the next smaller integer, which is the conservative choice.
- Confidence Intervals: A \((1-\alpha)100\%\) confidence interval for \(\mu_X-\mu_Y\) is
\[CI(\alpha): \overline{x}-\overline{y} \pm t_{df, {\alpha\over 2}} SE(\overline{x}-\overline{y})= \overline{x}-\overline{y} \pm t_{df, {\alpha\over 2}} \sqrt{{1\over {n_1}} {\sum_{i=1}^{n_1}{(x_i-\overline{x})^2\over n_1-1}} + {1\over {n_2}} {\sum_{i=1}^{n_2}{(y_i-\overline{y})^2\over n_2-1}}}.\]
Note that \(SE(\overline{x} -\overline{y})=\sqrt{SE^2(\overline{x})+SE^2(\overline{y})}\), because the samples are independent.
- The degrees of freedom \(df\) are computed as in the significance test above, and \(t_{df, {\alpha\over 2}}\) is the critical value for a Student's T distribution with \(df\) degrees of freedom at \({\alpha\over 2}\).
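For small samples, the test statistic and the degrees-of-freedom formula can be computed as in the sketch below. The function name `welch_t` and the data are hypothetical and chosen only for illustration. The degrees of freedom are returned unrounded, and the critical value \(t_{df, {\alpha\over 2}}\) is not computed here because the Python standard library has no Student's T quantile function; it would come from a T table or a statistics library.

```python
import math
from statistics import mean, variance

def welch_t(x, y, mu_0=0.0):
    """Observed t statistic, degrees of freedom, and SE of the difference
    for two independent samples with possibly unequal variances."""
    n1, n2 = len(x), len(y)
    se2_x = variance(x) / n1          # SE^2 of x-bar
    se2_y = variance(y) / n2          # SE^2 of y-bar
    se = math.sqrt(se2_x + se2_y)     # SE of (x-bar - y-bar)
    t = (mean(x) - mean(y) - mu_0) / se
    # Degrees of freedom from the formula above (left unrounded).
    df = (se2_x + se2_y) ** 2 / (se2_x ** 2 / (n1 - 1) + se2_y ** 2 / (n2 - 1))
    return t, df, se

# Hypothetical measurements for two independent groups.
x = [5.1, 4.8, 6.0, 5.5, 5.9]
y = [4.2, 4.9, 4.4, 5.0, 4.1, 4.6]

t_o, df, se = welch_t(x, y)
print(f"t_o = {t_o:.3f}, df = {df:.2f}, SE = {se:.3f}")
```

The computed degrees of freedom always lie between \(\min(n_1-1, n_2-1)\) and \(n_1+n_2-2\), reaching the upper bound when the two samples have equal sizes and equal sample variances.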
References
- SOCR Home page: http://www.socr.ucla.edu