Revision as of 13:13, 13 May 2009

General Advance-Placement (AP) Statistics Curriculum - Geometric, HyperGeometric, Negative Binomial Random Variables and Experiments

Geometric

  • Definition: The Geometric Distribution is the probability distribution of the number X of Bernoulli trials needed to get one success, supported on the set {1, 2, 3, ...}. The name derives from the mathematical notion of a geometric series, whose terms match these probabilities.
  • Mass Function: If the probability of success on each trial is P(success)=p, then the probability that x trials are needed to get one success is \(P(X = x) = (1 - p)^{x-1} \times p\), for x = 1, 2, 3, 4,....
  • Expectation: The Expected Value of a geometrically distributed random variable X is \({1\over p}.\)
  • Variance: The Variance is \({1-p\over p^2}.\)
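These formulas are easy to verify numerically. A minimal Python sketch (standard library only; p=0.3 is chosen purely for illustration):

```python
def geometric_pmf(x: int, p: float) -> float:
    """P(X = x) = (1-p)^(x-1) * p, for x = 1, 2, 3, ..."""
    return (1 - p) ** (x - 1) * p

p = 0.3
# Mean 1/p and variance (1-p)/p^2, checked against a long truncated sum.
mean = sum(x * geometric_pmf(x, p) for x in range(1, 10_000))
var = sum((x - 1 / p) ** 2 * geometric_pmf(x, p) for x in range(1, 10_000))
print(geometric_pmf(3, p))  # 0.7^2 * 0.3 = 0.147
print(mean, var)            # ~ 3.333 and ~ 7.778
```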

HyperGeometric

The hypergeometric distribution is a discrete probability distribution that describes the number of successes in a sequence of n draws from a finite population without replacement. An experimental design for using Hypergeometric distribution is illustrated in this table:

  Type           Drawn   Not-Drawn   Total
  Defective      k       m-k         m
  Non-Defective  n-k     N+k-n-m     N-m
  Total          n       N-n         N
  • Explanation: Suppose there is a shipment of N objects in which m are defective. The Hypergeometric Distribution describes the probability that in a sample of n distinctive objects drawn from the shipment exactly k objects are defective.
  • Mass function: If the random variable X follows the Hypergeometric Distribution with parameters N, m and n, then the probability of getting exactly k successes is given by

\[ P(X=k) = {{{m \choose k} {{N-m} \choose {n-k}}}\over {N \choose n}}.\]

This formula for the Hypergeometric Mass Function may be interpreted as follows: There are \({{N}\choose{n}}\) possible samples (without replacement). There are \({{m}\choose{k}}\) ways to obtain k defective objects and there are \({{N-m}\choose{n-k}}\) ways to fill out the rest of the sample with non-defective objects.

The mean and variance of the hypergeometric distribution have the following closed forms:

Mean\[n \times m\over N\]
Variance\[{ {nm\over N} ( 1-{m\over N} ) (N-n)\over N-1}\]

Examples

  • SOCR Activity: The SOCR Ball and Urn Experiment provides a hands-on demonstration of the utilization of Hypergeometric distribution in practice. This activity consists of selecting n balls at random from an urn with N balls, R of which are red and the other N - R green. The number of red balls Y in the sample is recorded on each update. The distribution and moments of Y are shown in blue in the distribution graph and are recorded in the distribution table. On each update, the empirical density and moments of Y are shown in red in the distribution graph and are recorded in the distribution table. Either of two sampling models can be selected with the list box: with replacement and without replacement. The parameters N, R, and n can be varied with scroll bars.
SOCR Activities BallAndUrnExperiment SubTopic Chui 050307 Fig2.JPG
  • A lake contains 1,000 fish; 100 are randomly caught and tagged. Suppose that later we catch 20 fish. Use SOCR Hypergeometric Distribution to:
    • Compute the probability mass function of the number of tagged fish in the sample of 20.
    • Compute the expected value and the variance of the number of tagged fish in this sample.
    • Compute the probability that this random sample contains more than 3 tagged fish.
SOCR EBook Dinov RV HyperGeom 013008 Fig9.jpg
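Besides the SOCR applet, the three questions above can be answered directly from the mass function. A Python sketch using the lake-fish numbers (N=1000 fish, m=100 tagged, a sample of n=20):

```python
from math import comb

def hypergeom_pmf(k: int, N: int, m: int, n: int) -> float:
    """P(X = k) = C(m,k) * C(N-m, n-k) / C(N,n)."""
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)

N, m, n = 1000, 100, 20              # population, tagged, sample size
pmf = [hypergeom_pmf(k, N, m, n) for k in range(0, n + 1)]
mean = n * m / N                                     # expected tagged fish = 2.0
var = (n * m / N) * (1 - m / N) * (N - n) / (N - 1)  # ~ 1.766
p_more_than_3 = sum(pmf[4:])                         # P(X > 3)
print(mean, var, p_more_than_3)
```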
  • The Hypergeometric distribution may also be used to estimate the population size: Let N = number of fish in a particular isolated region. Suppose we catch, tag and release M=200 fish. Several days later, when the tagged fish have mixed randomly with the untagged ones, we take a sample of n=100 and observe m=5 tagged fish. Let p=200/N be the population proportion of tagged fish. Since we sample fish without replacement, the Hypergeometric distribution is the exact model for this process. Assuming the sample size (n) is less than 5% of the population size (N), we can use the binomial approximation to the hypergeometric. If the sample of n=100 fish has 5 tagged, the sample proportion (an estimate of the population proportion) is \(\hat{p}={5\over 100}=0.05\). Thus we can estimate \(0.05=\hat{p}={200\over N}\), so \(N\approx 4,000\), as shown on the figure below.
SOCR EBook Dinov Prob HyperG 041108 Fig9a.jpg
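The capture-recapture estimate above reduces to one line of arithmetic, sketched here for completeness:

```python
# Capture-recapture estimate via the binomial approximation:
# p_hat = m/n estimates M/N, so N ~ M / p_hat.
M = 200        # fish tagged and released
n = 100        # size of the second sample
m = 5          # tagged fish observed in the second sample
p_hat = m / n            # 0.05
N_hat = M / p_hat        # ~ 4000
print(p_hat, N_hat)
```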

Negative Binomial

The family of Negative Binomial Distributions is a two-parameter family; p and r with 0 < p < 1 and r > 0. There are two equivalent combinatorial interpretations of the Negative Binomial process (X or Y).

X=Trial index (n) of the rth success, or Total # of experiments (n) to get r successes

  • Probability Mass Function\[ P(X=n) = {n-1 \choose r-1}\cdot p^r \cdot (1-p)^{n-r} \!\], for n = r,r+1,r+2,.... (n=trial number of the rth success)
  • Mean\[E(X)= {r \over p}\]
  • Variance\[Var(X)= {r(1-p) \over p^2}\]
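A short Python check of this trial-count parameterization (r=6, p=0.3 are illustrative values; the truncated sums approximate the infinite series):

```python
from math import comb

def negbin_trials_pmf(n: int, r: int, p: float) -> float:
    """P(X = n) = C(n-1, r-1) * p^r * (1-p)^(n-r), for n = r, r+1, ..."""
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

r, p = 6, 0.3
total = sum(negbin_trials_pmf(n, r, p) for n in range(r, 2000))
mean = sum(n * negbin_trials_pmf(n, r, p) for n in range(r, 2000))
print(total, mean)  # total ~ 1, mean ~ r/p = 20
```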

Y = Number of failures (k) to get r successes

  • Probability Mass Function\[ P(Y=k) = {k+r-1 \choose k}\cdot p^r \cdot (1-p)^k \!\], for k = 0,1,2,.... (k=number of failures before the rth successes)
  • \(Y \sim NegBin(r, p)\), the probability of k failures and r successes in n=k+r Bernoulli(p) trials with success on the last trial.
  • Mean\[E(Y)= {r(1-p) \over p}\].
  • Variance\[Var(Y)= {r(1-p) \over p^2}\].
  • Note that X = Y + r, and E(X) = E(Y) + r, whereas Var(X) = Var(Y).
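The relation X = Y + r means the two mass functions agree after a shift of r; a quick Python check (r=6, p=0.3 chosen for illustration):

```python
from math import comb

def pmf_X(n, r, p):   # X = trial index of the r-th success
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

def pmf_Y(k, r, p):   # Y = number of failures before the r-th success
    return comb(k + r - 1, k) * p**r * (1 - p) ** k

r, p = 6, 0.3
# P(X = n) must equal P(Y = n - r) for every n >= r,
# since C(n-1, r-1) = C(n-1, n-r).
checks = [abs(pmf_X(n, r, p) - pmf_Y(n - r, r, p)) for n in range(r, 60)]
print(max(checks))  # 0.0
```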

SOCR Negative Binomial Experiment

Application

Suppose Jane is promoting and fund-raising for a presidential candidate. She wants to visit all 50 states and she's pledged to get all electoral votes of 6 states before she and the candidate she represents are satisfied. In every state, there is a 30% chance that Jane will be able to secure all electoral votes and 70% chance that she'll fail.

  • What's the probability mass function of the number of failures (k=n-r) needed to get r=6 successes?
In other words, what is the probability that her 6th success (securing all the electoral votes of a 6th state) occurs at the nth state she campaigns in?

The NegBin(r, p) distribution describes the probability of k failures and r successes in n=k+r Bernoulli(p) trials, with a success on the last trial. Looking to secure the electoral votes of 6 states means Jane needs to get 6 successes before she (and her candidate) is happy. The number of trials (i.e., states visited) needed is n=k+6. The random variable we are interested in is X={number of states visited to achieve 6 successes (secure all electoral votes within these states)}. So, n = k+6, and \(X\sim NegBin(r=6, p=0.3)\). Thus, for \(n \geq 6\), the mass function (giving the probability that Jane will visit n states until her ultimate success) is:

\[ P(X=n) = {n-1 \choose r-1}\cdot p^r \cdot (1-p)^{n-r} = {n-1 \choose 5} \cdot 0.3^6 \cdot 0.7^{n-6} \]

  • What's the probability that Jane finishes her campaign in the 10th state?
Let \(X\sim NegBin(r=6, p=0.3)\), then \(P(X=10) = {10-1 \choose 6-1}\cdot 0.3^6 \cdot 0.7^{10-6} = 0.022054.\)
SOCR EBook Dinov RV NegBinomial 013008 Fig4.jpg
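The same value follows directly from the mass function; a Python check:

```python
from math import comb

r, p = 6, 0.3
n = 10
# P(X = 10) = C(9, 5) * 0.3^6 * 0.7^4
prob = comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)
print(round(prob, 6))  # 0.022054
```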
  • What's the probability that Jane finishes campaigning on or before reaching the 8th state?

\[ P(X\leq 8) = 0.011292\]

SOCR EBook Dinov RV NegBinomial 013008 Fig5.jpg
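A numerical check, summing the mass function over n = 6, 7, 8:

```python
from math import comb

def negbin_trials_pmf(n, r, p):
    """P(X = n) = C(n-1, r-1) * p^r * (1-p)^(n-r)."""
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

r, p = 6, 0.3
prob = sum(negbin_trials_pmf(n, r, p) for n in range(r, 9))  # n = 6, 7, 8
print(round(prob, 6))  # 0.011292
```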
  • Suppose the chance of getting all electoral votes within a state is reduced to only 10%, so X~NegBin(r=6, p=0.1). Notice that the shape and domain of the Negative Binomial distribution change significantly now (see image below)!
What's the probability that Jane covers all 50 states but fails to get all electoral votes in any 6 states (as she had hoped for)?

\[ P(X\geq 50) = 0.632391\]

SOCR EBook Dinov RV NegBinomial 013008 Fig6.jpg
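This tail probability can be approximated numerically by summing the mass function up to n = 49 (the applet's reported value may differ slightly in the last digits from a direct summation):

```python
from math import comb

def negbin_trials_pmf(n, r, p):
    """P(X = n) = C(n-1, r-1) * p^r * (1-p)^(n-r)."""
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)

r, p = 6, 0.1
# P(X >= 50) = 1 - P(X <= 49), i.e. fewer than 6 successes in 49 states
prob = 1 - sum(negbin_trials_pmf(n, r, p) for n in range(r, 50))
print(prob)
```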
