Difference between revisions of "AP Statistics Curriculum 2007 Estim Proportion"

From SOCR
Revision as of 16:55, 6 February 2008

General Advance-Placement (AP) Statistics Curriculum - Estimating a Population Proportion

Estimating a Population Proportion

When the sample size is large, the sampling distribution of the sample proportion \(\hat{p}\) is approximately Normal by the Central Limit Theorem (CLT), because the sample proportion can be written as a sample average of Bernoulli random variables. When the sample size is small, the Normal approximation may be inadequate. To accommodate this, we modify the sample proportion \(\hat{p}\) slightly and obtain the corrected sample proportion \(\tilde{p}\): \[\hat{p}={y\over n} \longrightarrow \tilde{p}={y+0.5z_{\alpha \over 2}^2 \over n+z_{\alpha \over 2}^2},\] where \(y\) is the observed number of successes in \(n\) trials and \(z_{\alpha \over 2}\) is the normal critical value we saw earlier.

The standard error of \(\hat{p}\) also needs a slight modification \[SE_{\hat{p}} = \sqrt{\hat{p}(1-\hat{p})\over n} \longrightarrow SE_{\tilde{p}} = \sqrt{\tilde{p}(1-\tilde{p})\over n+z_{\alpha \over 2}^2}.\]

Confidence intervals for proportions

The confidence intervals for the sample proportion \(\hat{p}\) and the corrected-sample-proportion \(\tilde{p}\) are given by \[\hat{p}\pm z_{\alpha\over 2} SE_{\hat{p}}\]

\[\tilde{p}\pm z_{\alpha\over 2} SE_{\tilde{p}}\]

Example

Suppose a researcher is interested in studying the effect of aspirin in reducing heart attacks. He randomly recruits 500 subjects with evidence of early heart disease and has them take one aspirin daily for two years. At the end of the two years he finds that during the study only 17 subjects had a heart attack. Calculate a 95% (\(\alpha=0.05\)) confidence interval for the true (unknown) proportion of subjects with early heart disease that have a heart attack while taking aspirin daily. Note that \(z_{\alpha \over 2} = z_{0.025}=1.96\):

\[\hat{p} = {17\over 500}=0.034; \quad \tilde{p} = {17+0.5z_{0.025}^2\over 500+z_{0.025}^2}= {17+1.92\over 500+3.84}=0.038\]

\[SE_{\hat{p}}= \sqrt{0.034(1-0.034)\over 500}=0.0081; \quad SE_{\tilde{p}}= \sqrt{0.038(1-0.038)\over 500+3.84}=0.0085\]

And the corresponding confidence intervals are given by \[\hat{p}\pm 1.96 SE_{\hat{p}}=[0.0181, 0.0499]\]

\[\tilde{p}\pm 1.96 SE_{\tilde{p}}=[0.0213, 0.0547]\]
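The interval calculations for the aspirin example can be sketched in Python as follows. This is only an illustrative sketch: the function name `proportion_intervals` and its return layout are assumptions made here, not part of the curriculum, and the critical value 1.96 is hard-coded as the default for a 95% interval. Because the code keeps unrounded intermediate values, its results can differ in the last digits from hand calculations that round \(\tilde{p}\) and the standard errors first.

```python
import math

def proportion_intervals(y, n, z=1.96):
    """Return (standard, corrected) confidence intervals for a proportion.

    y: number of observed successes, n: sample size,
    z: normal critical value z_{alpha/2} (1.96 for a 95% interval).
    The name and return layout are hypothetical, chosen for illustration.
    """
    # Standard estimate: p_hat = y/n, SE = sqrt(p_hat(1-p_hat)/n)
    p_hat = y / n
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)

    # Corrected estimate: add 0.5*z^2 successes and z^2 trials
    p_tilde = (y + 0.5 * z**2) / (n + z**2)
    se_tilde = math.sqrt(p_tilde * (1 - p_tilde) / (n + z**2))

    standard = (p_hat - z * se_hat, p_hat + z * se_hat)
    corrected = (p_tilde - z * se_tilde, p_tilde + z * se_tilde)
    return standard, corrected

# Aspirin study: 17 heart attacks among 500 subjects
standard, corrected = proportion_intervals(17, 500)
print("standard :", standard)
print("corrected:", corrected)
```

The corrected interval is shifted slightly toward 0.5 relative to the standard one, which is the intended effect of adding pseudo-observations when the observed proportion is close to 0.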

Sample-size estimation

For a given margin of error we can derive the minimum sample size that guarantees an interval estimate within that margin. In this section the margin of error is taken to be the standard error of the corrected sample proportion:

\[SE_{\tilde{p}} = \sqrt{\tilde{p}(1-\tilde{p})\over n+z_{\alpha \over 2}^2}.\]

This equation has one unknown parameter (n), which we can solve for if we are given an upper limit for the margin of error.

\[SE_{\tilde{p}} \geq \sqrt{\tilde{p}(1-\tilde{p})\over n+z_{\alpha \over 2}^2} \longrightarrow n \geq {\tilde{p}(1-\tilde{p})\over {SE_{\tilde{p}}^2} } -z_{\alpha \over 2}^2.\]

Examples

Sample-Size Estimation

How many subjects are needed if the heart researchers want \(SE < 0.005\) for a 95% CI and have a guess, based on previous research, that \(\tilde{p}= 0.04\)? \[n \geq {0.04(1-0.04)\over 0.005^2} - 1.96^2=1532.16,\] so at least 1533 subjects are needed.
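The sample-size bound above can be sketched as a small Python helper. The function name `min_sample_size` is an assumption made for illustration; it simply evaluates \(n \geq \tilde{p}(1-\tilde{p})/SE_{\tilde{p}}^2 - z_{\alpha \over 2}^2\) and rounds up to the next whole subject.

```python
import math

def min_sample_size(p_guess, se_max, z=1.96):
    """Smallest integer n with p(1-p)/(n + z^2) <= se_max^2.

    p_guess: prior guess for the corrected proportion,
    se_max: upper limit on the standard error,
    z: normal critical value (1.96 for a 95% interval).
    Hypothetical helper, not part of the original curriculum.
    """
    n_real = p_guess * (1 - p_guess) / se_max**2 - z**2
    # Round up, and never return fewer than one subject
    return max(1, math.ceil(n_real))

# Heart study: prior guess 0.04, standard error limit 0.005
print(min_sample_size(0.04, 0.005))
```

Rounding up rather than to the nearest integer matters here, since any smaller n would leave the standard error above the requested limit.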

Sibling Genders

Is the gender of a second child influenced by the gender of the first child, in families with more than one child? The research hypothesis needs to be formulated before collecting, looking at, or interpreting the data that will be used to address it. Here the research hypothesis is that mothers whose first child is a girl are more likely to have a girl as a second child than mothers whose first child is a boy. Data: 20 years of birth records from one hospital in Auckland, New Zealand.

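{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
| colspan=2 rowspan=2|&nbsp;
| colspan=3| '''Second Child'''
|-
| Male || Female || '''Total'''
|-
| rowspan=2| '''First Child''' || Male || 3,202 || 2,776 || 5,978
|-
| Female || 2,620 || 2,792 || 5,412
|-
| colspan=2| '''Total''' || 5,822 || 5,568 || 11,390
|}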

Let \(p_1\) be the true proportion of girls born to mothers whose first child was a girl, and \(p_2\) the true proportion of girls born to mothers whose first child was a boy. The parameter of interest is \(p_1- p_2\). Hypotheses: \[H_0: p_1- p_2=0 \text{ (skeptical reaction)}, \qquad H_1: p_1- p_2>0 \text{ (research hypothesis)}.\]

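{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
| colspan=2 rowspan=2|&nbsp;
| colspan=3| '''Second Child'''
|-
| Number of births || Number of girls || '''Proportion'''
|-
| rowspan=2| '''Group''' || 1 (Previous child was girl) || 5412 || 2792 || 0.516
|-
| 2 (Previous child was boy) || 5978 || 2776 || 0.464
|}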
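The text stops at the summary table and does not say how the hypotheses are evaluated. One common approach, assumed here rather than taken from the source, is a pooled two-proportion z-test with a one-sided (upper-tail) p-value. The sketch below uses only the Python standard library; the function name `pooled_two_proportion_z` is hypothetical.

```python
import math

def pooled_two_proportion_z(x1, n1, x2, n2):
    """One-sided pooled z-test of H0: p1 - p2 = 0 against H1: p1 - p2 > 0.

    x1, x2: number of girls in each group; n1, n2: number of births.
    Assumption: the pooled-variance z-test is used; the source text
    does not specify a test procedure.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    # Under H0 both groups share one proportion, estimated by pooling
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Upper-tail probability of a standard Normal variable beyond z
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return p1, p2, z, p_value

# Group 1: previous child was a girl; Group 2: previous child was a boy
p1, p2, z, p_value = pooled_two_proportion_z(2792, 5412, 2776, 5978)
print(f"p1={p1:.3f}, p2={p2:.3f}, z={z:.2f}, p-value={p_value:.2e}")
```

A large positive z (and correspondingly small p-value) would count as evidence for the research hypothesis under this test; an unpooled standard error would be the natural choice instead if the goal were a confidence interval for \(p_1 - p_2\).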



