SOCR - User contributions [en]

AP Statistics Curriculum 2007 Prob Rules

2010-06-10T19:04:39Z

ChiccoChou: /* Bayesian Rule */

==[[AP_Statistics_Curriculum_2007 | General Advanced-Placement (AP) Statistics Curriculum]] - Probability Theory Rules==

=== Addition Rule===
The formula for the probability of a union, also called the [http://en.wikipedia.org/wiki/Inclusion-exclusion_principle Inclusion-Exclusion principle], allows us to compute probabilities of composite events represented as unions (i.e., sums) of simpler events.
[[Image:500px-Inclusion-exclusion.svg.png|150px|thumbnail|right| [http://upload.wikimedia.org/wikipedia/commons/thumb/4/42/Inclusion-exclusion.svg/180px-Inclusion-exclusion.svg.png Venn Diagrams]]]

For events <math>A_1, \ldots, A_n</math> in a probability space (S,P), the probability of the union for ''n=2'' is
:<math>P(A_1\cup A_2)=P(A_1)+P(A_2)-P(A_1\cap A_2).</math>

For ''n=3'',
:<math>P(A_1\cup A_2\cup A_3)=P(A_1)+P(A_2)+P(A_3) -P(A_1\cap A_2)-P(A_1\cap A_3)-P(A_2\cap A_3)+P(A_1\cap A_2\cap A_3)</math>

In general, for any ''n'',
:<math>P(\bigcup_{i=1}^n A_i) =\sum_{i=1}^n {P(A_i)}
-\sum_{i,j\,:\,i<j}{P(A_i\cap A_j)} +\sum_{i,j,k\,:\,i<j<k}{P(A_i\cap A_j\cap A_k)}+ \cdots</math>
:<math>\cdots\ +(-1)^{m+1} \sum_{i_1<i_2< \cdots < i_m}{P(\bigcap_{p=1}^m A_{i_p})}+ \cdots\cdots\ +(-1)^{n+1} P(\bigcap_{i=1}^n A_i).</math>
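The general formula can be checked numerically on a small finite sample space. The following is a minimal sketch, assuming equally likely outcomes (one roll of a fair die); the events <code>A1</code>, <code>A2</code>, <code>A3</code> and the helper names are illustrative choices, not part of the original notes.

```python
from fractions import Fraction
from itertools import combinations


def prob(event, space):
    """Probability of an event when all outcomes in `space` are equally likely."""
    return Fraction(len(event & space), len(space))


def union_prob_inclusion_exclusion(events, space):
    """Sum the alternating terms of the inclusion-exclusion formula."""
    total = Fraction(0)
    n = len(events)
    for m in range(1, n + 1):
        sign = (-1) ** (m + 1)
        # All intersections of exactly m of the events.
        for subset in combinations(events, m):
            total += sign * prob(set.intersection(*subset), space)
    return total


# Hypothetical example: one roll of a fair six-sided die.
space = set(range(1, 7))
A1 = {2, 4, 6}   # even outcome
A2 = {4, 5, 6}   # outcome of at least 4
A3 = {1, 2}      # outcome of at most 2
events = [A1, A2, A3]

via_formula = union_prob_inclusion_exclusion(events, space)
direct = prob(set.union(*events), space)
print(via_formula, direct)
```

Both values agree because the alternating sum corrects for outcomes that are counted more than once in the single-event terms.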

===Conditional Probability===
The conditional probability of A occurring given that B occurs is given by
:<math>P(A | B) ={P(A \cap B) \over P(B)}.</math>

* See this [[SOCR_EduMaterials_Activities_BivariateNormalExperiment#Applications_-__Conditional_Probability | demonstration of conditional probability using the SOCR Bivariate Normal Experiment]].

* Also see [[SOCR_EduMaterials_Activities_CoinDieExperiment#Applications:_Conditional_Probability | demonstration of conditional probability using the SOCR Coin-Die Experiment]].

===Examples===
====Contingency table====
Here are the data on 400 melanoma (skin cancer) patients, classified by type and site:
<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
| rowspan="2"|Type || colspan="3" align="center"|Site || rowspan="2"|Totals
|-
| Head and Neck || Trunk || Extremities
|-
| Hutchinson's melanotic freckle || 22 || 2 || 10 || 34
|-
| Superficial || 16 || 54 || 115 || 185
|-
| Nodular || 19 || 33 || 73 || 125
|-
| Indeterminate || 11 || 17 || 28 || 56
|-
| Column Totals || 68 || 106 || 226 || 400
|}
</center>

* Suppose we select one of the 400 patients in the study at random and want to find the probability that the cancer is on the extremities ''given'' that it is of the nodular type: P(Extremities | Nodular) = 73/125 = 0.584.

* What is the probability that for a randomly chosen patient the cancer type is Superficial given that it appears on the Trunk?
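Conditional probabilities of this kind can be read directly off the table by restricting attention to one row or one column. The sketch below assumes the counts in the table above; the short dictionary keys and the function names are illustrative labels only.

```python
from fractions import Fraction

# Counts from the melanoma table: counts[type][site].
counts = {
    "Hutchinson": {"Head and Neck": 22, "Trunk": 2, "Extremities": 10},
    "Superficial": {"Head and Neck": 16, "Trunk": 54, "Extremities": 115},
    "Nodular": {"Head and Neck": 19, "Trunk": 33, "Extremities": 73},
    "Indeterminate": {"Head and Neck": 11, "Trunk": 17, "Extremities": 28},
}


def p_site_given_type(site, cancer_type):
    """P(site | type): restrict to one row and divide by the row total."""
    row = counts[cancer_type]
    return Fraction(row[site], sum(row.values()))


def p_type_given_site(cancer_type, site):
    """P(type | site): restrict to one column and divide by the column total."""
    column_total = sum(row[site] for row in counts.values())
    return Fraction(counts[cancer_type][site], column_total)


print(p_site_given_type("Extremities", "Nodular"))
print(p_type_given_site("Superficial", "Trunk"))
```

The second call addresses the question in the last bullet: the conditioning event changes the denominator from a row total to a column total.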

====Monty Hall Problem====
Recall that earlier we discussed the [[AP_Statistics_Curriculum_2007_Prob_Basics#Hands-on_activities | Monty Hall Experiment]]. We will now show why the probability of winning doubles if we use the swap strategy: the probability of a win is 2/3 if we always switch to the remaining (third) option.

Denote W={Final Win of the Car Prize}. Let <math>L_1</math> and <math>W_1</math> represent the events of choosing the donkey (losing) and the car (winning) at the player's first choice, and let <math>W_2</math> represent the event of choosing the car at the second choice. Then, the chance of winning in the swapping-strategy case is:
<math>P(W) = P(L_1 \cap W_2) = P(W_2 | L_1) P(L_1) = 1 \times {2\over 3} ={2\over 3}</math>. If we played using the [[AP_Statistics_Curriculum_2007_Prob_Basics#Hands-on_activities | stay strategy]] (keeping the first choice), our chance of winning would have been:
<math>P(W) = P(W_1 \cap W_2) = P(W_2 | W_1) P(W_1) = 1 \times {1\over 3} ={1\over 3}</math>, or half the chance in the first (swapping) case.
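The two strategies can also be compared by simulation. This is a rough sketch, assuming the standard three-option game in which the host always reveals a losing option other than the player's pick; the function names, the number of trials and the seed are arbitrary choices for illustration.

```python
import random


def play(switch, rng):
    """Play one game; return True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    first = rng.choice(doors)
    # The host reveals an option that is neither the player's pick nor the car.
    opened = rng.choice([d for d in doors if d != first and d != car])
    if switch:
        # Exactly one option remains that was neither picked nor revealed.
        final = next(d for d in doors if d != first and d != opened)
    else:
        final = first
    return final == car


def win_rate(switch, trials=100_000, seed=0):
    """Fraction of wins over many independent games."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


print("swap:", win_rate(True))
print("stay:", win_rate(False))
```

With a large number of trials the observed win rates should be close to the theoretical values 2/3 and 1/3 derived above.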

====Drawing balls without replacement====
Suppose we draw 2 balls at random, one at a time and without replacement, from an urn containing 4 black and 3 white balls that are otherwise identical. What is the probability that the second ball is black? What is the sample space?
P({2nd ball is black}) = P({2nd is black} ∩ {1st is black}) + P({2nd is black} ∩ {1st is white}) = 3/6 × 4/7 + 4/6 × 3/7 = 4/7.
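One way to make the sample space explicit is to label the seven balls and list every ordered pair of distinct balls; each pair is then equally likely. The sketch below takes that approach, with the labelling scheme being an assumption made for the illustration.

```python
from fractions import Fraction
from itertools import permutations

# Label the balls 0..6; the first four are black, the last three are white.
balls = ["B"] * 4 + ["W"] * 3

# Sample space: all ordered pairs of distinct balls (first draw, second draw).
draws = list(permutations(range(len(balls)), 2))

# Event of interest: the second ball drawn is black.
second_black = [d for d in draws if balls[d[1]] == "B"]

p_second_black = Fraction(len(second_black), len(draws))
print(len(draws), p_second_black)
```

The enumeration gives the same answer as the two-term sum above, and it equals the probability that the first ball is black.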

===Inverting the order of conditioning===
In many practical situations it is beneficial to be able to swap the event of interest and the conditioning event when we are computing probabilities. This can easily be accomplished using this trivial, yet powerful, identity:
<center> <math>P(A \cap B) = P(A | B) \times P(B) = P(B | A) \times P(A)</math></center>

===Example - inverting conditioning===
Suppose we classify the entire female population into 2 classes: healthy controls (NC) and cancer patients. If a woman has a positive mammogram result, what is the probability that she has breast cancer?

Suppose we obtain medical evidence for a subject in terms of the results of her mammogram (imaging) test: a positive or a negative mammogram. If P(Positive Test) = 0.107, P(Cancer) = 0.1, and P(Positive Test | Cancer) = 0.8, then we can easily calculate the probability of real interest, the chance that the subject has cancer given a positive test:
<center><math>P(Cancer | Positive Test) = {P(Positive Test | Cancer) \times P(Cancer) \over P(Positive Test)}= {0.8\times 0.1 \over 0.107} \approx 0.748</math> </center>

This equation has 3 known parameters and 1 unknown variable, so we can solve for P(Cancer | Positive Test) to determine the chance that the patient has breast cancer given that her mammogram was read as positive. This probability, of course, will significantly influence the treatment action recommended by the physician.

===Statistical Independence===
Events A and B are '''statistically independent''' if knowing whether B has occurred gives no new information about the chances of A occurring, i.e., if P(A | B) = P(A).

Note that if A is independent of B, then B is also independent of A, i.e., P(B | A) = P(B), since <math>P(B|A)={P(B \cap A) \over P(A)} = {P(A|B)P(B) \over P(A)} = P(B)</math>.

If A and B are statistically independent, then <math>P(B \cap A) = P(A) \times P(B).</math>

=== Multiplication Rule===

For any two events (whether dependent or independent):
<center><math>P(A \cap B) = P(B|A)P(A) = P(A|B)P(B).</math></center>

In general, for any collection of events:
<center><math>P(A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_n) = P(A_1)P(A_2|A_1)P(A_3|A_1 \cap A_2)P(A_4|A_1 \cap A_2 \cap A_3) \cdots </math>
<math>\cdots P(A_{n-1}|A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_{n-2})P(A_n|A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_{n-1})</math></center>
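As an illustration of the general multiplication rule, consider again an urn with 4 black and 3 white balls and ask for the probability that the first three balls drawn without replacement are all black. The sketch below multiplies the successive conditional probabilities and compares the result with a brute-force enumeration; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def chain_rule_all_black(black, white, k):
    """P(A_1)P(A_2|A_1)...P(A_k|A_1,...,A_{k-1}), where A_i = {i-th ball is black}."""
    p = Fraction(1)
    for i in range(k):
        # After i black balls are removed, black - i of the remaining balls are black.
        p *= Fraction(black - i, black + white - i)
    return p


def enumerate_all_black(black, white, k):
    """Same probability obtained by listing all equally likely ordered draws."""
    balls = ["B"] * black + ["W"] * white
    draws = list(permutations(range(len(balls)), k))
    hits = sum(all(balls[i] == "B" for i in d) for d in draws)
    return Fraction(hits, len(draws))


print(chain_rule_all_black(4, 3, 3), enumerate_all_black(4, 3, 3))
```

Here the product is 4/7 × 3/6 × 2/5, with each factor conditioned on all the earlier draws having been black.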

===Law of total probability===
If {<math>A_1, A_2, A_3, \cdots, A_n</math>} partition the sample space ''S'' (i.e., the events are pairwise mutually exclusive and <math>\cup_{i=1}^n {A_i}=S</math>), then for any event B
<center><math>P(B) = P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \cdots + P(B|A_n)P(A_n)
</math></center>

[[Image:SOCR_EBook_Dinov_Probability_012808_Fig2.jpg|150px|thumbnail|right]]

* For example, if <math>A_1</math> and <math>A_2</math> partition the sample space (think of males and females), then the probability of any event B (e.g., smoker) may be computed by:
<math>P(B) = P(B|A_1)P(A_1) + P(B|A_2)P(A_2)</math>. This is a simple consequence of the fact that <math>P(B) = P(B\cap S) = P(B \cap (A_1 \cup A_2))</math>. Since <math>B \cap A_1</math> and <math>B \cap A_2</math> are disjoint,
<math>P(B)=P((B \cap A_1) \cup (B \cap A_2))= P(B \cap A_1) + P(B \cap A_2) = P(B|A_1)P(A_1) + P(B|A_2)P(A_2)</math>.
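The law of total probability is a weighted sum and is easy to express in code. The sketch below applies it to the earlier urn example (4 black and 3 white balls, two draws without replacement), partitioning on the color of the first ball; the function name and argument layout are assumptions made for the illustration.

```python
from fractions import Fraction


def total_probability(conditionals, priors):
    """P(B) = sum_i P(B|A_i) P(A_i) for a partition A_1, ..., A_n."""
    if sum(priors) != 1:
        raise ValueError("the partition probabilities must sum to 1")
    return sum(c * p for c, p in zip(conditionals, priors))


# Partition: A_1 = {first ball is black}, A_2 = {first ball is white}.
priors = [Fraction(4, 7), Fraction(3, 7)]
# B = {second ball is black}: 3 of the remaining 6 balls are black after a
# black first draw, and 4 of the remaining 6 after a white first draw.
conditionals = [Fraction(3, 6), Fraction(4, 6)]

p_b = total_probability(conditionals, priors)
print(p_b)
```

The check on the priors guards against passing a set of events that does not cover the whole sample space.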

===Bayesian Rule===
If {<math>A_1, A_2, A_3, \cdots, A_n</math>} partition the sample space ''S'' and A and B are any events (subsets of S) with P(B) > 0, then:
<center><math>P(A | B) = {P(B | A) P(A) \over P(B)} = {P(B | A) P(A) \over P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \cdots + P(B|A_n)P(A_n)}.</math></center>

===Independence vs. disjointness/mutual-exclusiveness===
: The events A and B are ''independent'' if P(A|B)=P(A) or, equivalently, <math>P(A \cap B) = P(A)P(B).</math>
: The events C and D are ''disjoint, or mutually-exclusive'', if they cannot occur together, so that <math>P(C\cap D) = 0</math>. In that case, <math>P(C\cup D)=P(C)+P(D).</math>
Mutual-exclusiveness and independence are different concepts. Here are two examples clarifying the differences between these concepts:
* Suppose we play a [[SOCR_EduMaterials_Activities_CardExperiment | card game]] of ''guessing the color of a randomly drawn card'' from a [[AP_Statistics_Curriculum_2007_Prob_Simul#Poker_Game |standard 52-card deck]]. As there are 2 possible colors (black and red), and given no other information, the chance for correctly guessing the color (e.g., black) is 0.5. However, additional information may or may not be helpful in identifying the card color. For example:
** If we know that the [[AP_Statistics_Curriculum_2007_Prob_Count#Hands-on_combination_activity |card denomination]] is a king, this does '''not''' help us improve our chances of correctly identifying the color of the card, since there are 2 red and 2 black kings: P(Red|King)=P(Red), independence.
** If we know that the suit of the card is hearts, this does help us with correctly identifying the card color (as hearts are red): P(Red|Hearts)=1.0, strong dependence.
** Notes:
:: In both cases, the pairs of events (A={Red} and B={King}, A={Red} and C={Hearts}) are '''not''' mutually exclusive (disjoint)!
:: Events that are mutually exclusive (disjoint), and that each have positive probability, cannot be independent!
* [http://en.wikipedia.org/wiki/Color_blindness Color blindness] is a sex-linked trait, as many of the genes involved in color vision are on the [http://en.wikipedia.org/wiki/X_chromosome X chromosome]. Color blindness is more common in males than in females, as men do not have a second X chromosome to compensate for the chromosome that carries the mutation. If 8% of variants of a given gene are defective (mutated), the probability of a single copy being defective is 8%, but the probability that two (independent) copies are both defective is 0.08 × 0.08 = 0.0064.
: The events A={Female} and B={Color blind} are not mutually exclusive (females can be color blind), nor are they independent (the rate of color blindness among females is lower). Color blindness prevalence within the 2 genders is P(CB|Male) = 0.08 and P(CB|Female) = 0.005, where CB={color blind, one color, a color combination, or another mutation}.
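The card example can be verified by enumerating a standard 52-card deck. The sketch below checks the independence and dependence claims and also shows a pair of disjoint events that are not independent; the choice of Hearts and Spades for the disjoint pair is an added illustration, not taken from the text above.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["Hearts", "Diamonds", "Clubs", "Spades"]
deck = set(product(ranks, suits))  # 52 equally likely cards


def prob(event):
    """P(event) for a single card drawn at random."""
    return Fraction(len(event), len(deck))


def cond(a, b):
    """P(a | b) = P(a and b) / P(b)."""
    return Fraction(len(a & b), len(b))


red = {c for c in deck if c[1] in ("Hearts", "Diamonds")}
king = {c for c in deck if c[0] == "K"}
hearts = {c for c in deck if c[1] == "Hearts"}
spades = {c for c in deck if c[1] == "Spades"}

# Independent: knowing the card is a king does not change P(Red).
print(cond(red, king), prob(red))
# Dependent: knowing the card is a heart makes Red certain.
print(cond(red, hearts))
# Disjoint but not independent: P(Hearts and Spades) = 0, yet P(Hearts)P(Spades) > 0.
print(prob(hearts & spades), prob(hearts) * prob(spades))
```

The last line illustrates the note above: once one of two disjoint events is known to have occurred, the other is ruled out, so the events carry information about each other.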

===Example===
Suppose a laboratory blood test is used as evidence for a disease. Assume P(positive Test | Disease) = 0.95, P(positive Test | no Disease) = 0.01 and P(Disease) = 0.005. Find P(Disease | positive Test).

Denote D = {the tested person has the disease}, <math>D^c</math> = {the tested person does not have the disease} and T = {the test result is positive}. Then
<center><math>P(D | T) = {P(T | D) P(D) \over P(T)} = {P(T | D) P(D) \over P(T|D)P(D) + P(T|D^c)P(D^c)}=</math>
<math>={0.95\times 0.005 \over {0.95\times 0.005 +0.01\times 0.995}}={0.00475 \over 0.0147}\approx 0.323.</math></center>
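The same calculation can be written as a small function, which makes it easy to recheck the arithmetic or to try other prevalence values. This is a minimal sketch for the two-event partition {D, D^c}; the function and parameter names are illustrative.

```python
def posterior(prior, p_pos_given_d, p_pos_given_not_d):
    """P(D | T) via Bayes' rule, with P(T) from the law of total probability."""
    evidence = p_pos_given_d * prior + p_pos_given_not_d * (1 - prior)
    return p_pos_given_d * prior / evidence


# Values from the blood-test example above.
p_d_given_t = posterior(prior=0.005, p_pos_given_d=0.95, p_pos_given_not_d=0.01)
print(round(p_d_given_t, 3))

# The posterior depends strongly on how common the disease is.
for prevalence in (0.001, 0.005, 0.05):
    print(prevalence, round(posterior(prevalence, 0.95, 0.01), 3))
```

Even with an accurate test, a positive result for a rare disease leaves considerable uncertainty, because most people tested do not have the disease.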

===See also===
* [[AP_Statistics_Curriculum_2007_Bayesian_Prelim | Bayesian Chapter]]

===[[EBook_Problems_Prob_Rules|Problems]]===

===References===
* [http://www.stat.ucla.edu/%7Edinov/courses_students.dir/07/Fall/STAT13.1.dir/STAT13_notes.dir/lecture03.pdf Probability Lecture Notes (PDF)]

<hr>
* SOCR Home page: http://www.socr.ucla.edu

{{translate|pageName=http://wiki.stat.ucla.edu/socr/index.php?title=AP_Statistics_Curriculum_2007_Prob_Rules}}

AP Statistics Curriculum 2007 Prob Rules

2010-06-10T19:04:12Z

ChiccoChou: /* Law of total probability */

AP Statistics Curriculum 2007 Prob Rules

2010-06-10T19:02:24Z

ChiccoChou: /* Inverting the order of conditioning */

==[[AP_Statistics_Curriculum_2007 | General Advance-Placement (AP) Statistics Curriculum]] - Probability Theory Rules==

=== Addition Rule===
The probability of a union, also called the [http://en.wikipedia.org/wiki/Inclusion-exclusion_principle Inclusion-Exclusion principle] allows us to compute probabilities of composite events represented as unions (i.e., sums) of simpler events.
[[Image:500px-Inclusion-exclusion.svg.png|150px|thumbnail|right| [http://upload.wikimedia.org/wikipedia/commons/thumb/4/42/Inclusion-exclusion.svg/180px-Inclusion-exclusion.svg.png Venn Diagrams]]]

For events ''A''1, ..., ''A''''n'' in a probability space (S,P), the probability of the union for ''n=2'' is
:<math>P(A_1\cup A_2)=P(A_1)+P(A_2)-P(A_1\cap A_2),</math>

For ''n=3'',
:<math>P(A_1\cup A_2\cup A_3)=P(A_1)+P(A_2)+P(A_3) -P(A_1\cap A_2)-P(A_1\cap A_3)-P(A_2\cap A_3)+P(A_1\cap A_2\cap A_3)</math>

In general, for any ''n'',
:<math>P(\bigcup_{i=1}^n A_i) =\sum_{i=1}^n {P(A_i)}
-\sum_{i,j\,:\,i<j}{P(A_i\cap A_j)} +\sum_{i,j,k\,:\,i<j<k}{P(A_i\cap A_j\cap A_k)}+ \cdots</math>
:<math>\cdots\ +(-1)^{m+1} \sum_{i_1<i_2< \cdots < i_m}{P(\bigcap_{p=1}^m A_{i_p})}+ \cdots\cdots\ +(-1)^{n+1} P(\bigcap_{i=1}^n A_i).</math>

===Conditional Probability===
The conditional probability of A occurring given that B occurs is given by
:<math>P(A | B) ={P(A \cap B) \over P(B)}.</math>

* See this [[SOCR_EduMaterials_Activities_BivariateNormalExperiment#Applications_-__Conditional_Probability | demonstration of conditional probability using the SOCR Bivariate Normal Experiment]].

* Also see [[SOCR_EduMaterials_Activities_CoinDieExperiment#Applications:_Conditional_Probability | demonstration of conditional probability using the SOCR Coin-Die Experiment]].

===Examples===
====Contingency table====
Here is the data on 400 Melanoma (skin cancer) Patients by Type and Site
<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|-
| rowspan="2"|Type || colspan="3" align="center"|Site || rowspan="2"|Totals
|-
| Head and Neck || Trunk || Extremities
|-
| Hutchinson's melanomic freckle || 22 || 2 || 10 || 34
|-
| Superficial || 16 || 54 || 115 || 185
|-
| Nodular || 19 || 33 || 73 || 125
|-
| Indeterminant || 11 || 17 || 28 || 56
|-
| Column Totals || 68 || 106 || 226 || 400
|}
</center>

* Suppose we select one out of the 400 patients in the study and we want to find the probability that the cancer is on the extremities ''given'' that it is of type nodular: P = 73/125 = P(Extremities | Nodular)

* What is the probability that for a randomly chosen patient the cancer type is Superficial given that it appears on the Trunk?

====Monty Hall Problem====
Recall that earlier we discussed the [[AP_Statistics_Curriculum_2007_Prob_Basics#Hands-on_activities | Monty Hall Experiment]]. We will now show why the odds of winning double if we use the swap strategy - that is the probability of a win is 2/3, if each time we switch and choose the last third card.

Denote W={Final Win of the Car Price}. Let L1 and W2 represent the events of choosing the donkey (loosing) and the car (winning) at the player's first and second choice, respectively. Then, the chance of winning in the swapping-strategy case is:
<math>P(W) = P(L_1 \cap W_2) = P(W_2 | L_1) P(L_1) = 1 \times {2\over 3} ={2\over 3}</math>. If we played using the [[AP_Statistics_Curriculum_2007_Prob_Basics#Hands-on_activities | stay-home strategy]], our chance of winning would have been:
<math>P(W) = P(W_1 \cap W_2) = P(W_2 | W_1) P(W_1) = 1 \times {1\over 3} ={1\over 3}</math>, or half the chance in the first (swapping) case.

====Drawing balls without replacement====
Suppose we draw 2 balls at random, one at a time without replacement from an urn containing 4 black and 3 white balls, otherwise identical. What is the probability that the second ball is black? Sample Space?
P({2-nd ball is black}) = P({2-nd is black} &{1-st is black}) + P({2-nd is black} &{1-st is white}) = 4/7 x 3/6 + 4/6 x 3/7 = 4/7.

===Inverting the order of conditioning===
In many practical situations it is beneficial to be able to swap the event of interest and the conditioning event when we are computing probabilities. This can easily be accomplished using this trivial, yet powerful, identity:
<center> <math>P(A \cap B) = P(A | B) \times P(B) = P(B | A) \times P(A)</math></center>

===Example - inverting conditioning===
Suppose we classify the entire female population into 2 Classes: healthy(NC) controls and cancer patients. If a woman has a positive mammogram result, what is the probability that she has breast cancer?

Suppose we obtain medical evidence for a subject in terms of the results of her mammogram (imaging) test: positive or negative mammogram . If P(Positive Test) = 0.107, P(Cancer) = 0.1, P(Positive test | Cancer) = 0.8, then we can easily calculate the probability of real interest - what is the chance that the subject has cancer:
<center><math>P(Cancer | Positive Test) = {P(Positive Test | Cancer) \times P(Cancer) \over P(Positive Test)}= {0.8\times 0.1 \over 0.107}</math> </center>

This equation has 3 known parameters and 1 unknown variable, so, we can solve for P(Cancer | Positive Test) to determine the chance the patient has breast cancer given that her mammogram was positively read. This probability, of course, will significantly influence the treatment action recommended by the physician.

===Statistical Independence===
Events A and B are '''statistically independent''' if knowing whether B has occurred gives no new information about the chances of A occurring, i.e., if P(A | B) = P(A).

Note that if A is independent of B, then B is also independent of A, i.e., P(B | A) = P(B), since <math>P(B|A)={P(B \cap A) \over P(A)} = {P(A|B)P(B) \over P(A)} = P(B)</math>.

If A and B are statistically independent, then <math>P(B \cap A) = P(A) \times P(B).</math>

=== Multiplication Rule===

For any two events (whether dependent or independent):
<center><math>P(A \cap B) = P(B|A)P(A) = P(A|B)P(B).</math></center>

In general, for any collection of events:
<center><math>P(A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_n) = P(A_1)P(A_2|A_1)P(A_3|A_1 \cap A_2)P(A_4|A_1 \cap A_2 \cap A_3) \cdots </math>
<math>\cdots P(A_{n-1}|A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_{n-2})P(A_n|A_1 \cap A_2 \cap A_3 \cap \cdots \cap A_{n-1})</math></center>

===Law of total probability===
If {<math>A_1, A_2, A_3, \cdots, A_n</math>} form a partition of the sample space ''S'' (i.e., all events are mutually exclusive and <math>\cup_{i=1}^n {A_i}=S</math>) then for any event B
<center><math>P(B) = P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \cdots + P(B|A_n)P(A_n)
</math></center>

[[Image:SOCR_EBook_Dinov_Probability_012808_Fig2.jpg|150px|thumbnail|right]]

* Example, if <math>A_1</math> and <math>A_2</math> partition the sample space (think of males and females), then the probability of any event B (e.g., smoker) may be computed by:
<math>P(B) = P(B|A_1)P(A_1) + P(B|A_2)P(A_2)</math>. This of course is a simple consequence of the fact that <math>P(B) = P(B\cap S) = P(B \cap (A_1 \cup A_2))</math>. Therefore,
<math>P(B)=P((B \cap A_1) \cup (B \cap A_2))= P(B|A_1)P(A_1) + P(B|A_2)P(A_2)</math>.

===Bayesian Rule===
If {<math>A_1, A_2, A_3, \cdots, A_n</math>} form a partition of the sample space ''S'' and A and B are any events (subsets of S), then:
<center><math>P(A | B) = {P(B | A) P(A) \over P(B)} = {P(B | A) P(A) \over P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \cdots + P(B|A_n)P(A_n)}.</math></center>

===Independence vs. disjointness/mutual-exclusiveness===
: The events A and B are ''independent'' if P(A|B)=P(A). That is <math>P(A \cap B) = P(A)P(B).</math>
: The events C and D are ''disjoint, or mutually-exclusive'', if <math>P(C\cap D) = 0</math>. That is <math>P(C\cup D)=P(C)+P(D).</math>
Mutual-exclusiveness and independence are different concepts. Here are two examples clarifying the differences between these concepts:
* Suppose we play a [[SOCR_EduMaterials_Activities_CardExperiment | card game]] of ''guessing the color of a randomly drawn card'' from a [[AP_Statistics_Curriculum_2007_Prob_Simul#Poker_Game |standard 52-card deck]]. As there are 2 possible colors (black and red), and given no other information, the chance for correctly guessing the color (e.g., black) is 0.5. However, additional information may or may not be helpful in identifying the card color. For example:
** If we know that the [[AP_Statistics_Curriculum_2007_Prob_Count#Hands-on_combination_activity |card denomination]] is a king, there are 2 red and 2 black kings, this does '''not''' help us improve our chances of successfully identifying the correct color of the card, P(Red|King)=P(Red), independence.
** If we know that the suit of the card is hearts, this does help is with correctly identifying the card color (as hearts are red), P(Red|Hearts)=1.0, strong dependence.
** Notes:
:: In both cases, the events A={Red} and B={King} and C={Hearts} are '''not''' mutually exclusive (disjoint)!
:: Events that are mutually exclusive (disjoint) cannot be independent!
* [http://en.wikipedia.org/wiki/Color_blindness Color blindness] is a sex-linked trait, as many of the genes involved in color vision are on the [http://en.wikipedia.org/wiki/X_chromosome X chromosome]. Color blindness more common in males than in females, as men do not have a second X chromosome to overwrite the chromosome which carries the mutation. If 8% of variants of a given gene are defective (mutated), the probability of a single copy being defective is 8%, but the probability that two (independent) copies are both defective is 0.08 × 0.08 = 0.0064.
: The events A={Female} and B={Color blind} are not mutually exclusive (females can be color blind), nor they are independent (the rate of color blindness among females is lower). Color blindness prevalence within the 2 genders is P(CB|Male) = 0.08, and P(CB|Female)=0.005, where CB={color blind, one color, a color combination, or another mutation}.

===Example===
Suppose a laboratory blood test is used as evidence for a disease. Assume P(positive Test| Disease) = 0.95, P(positive Test| no Disease)=0.01 and P(Disease) = 0.005. Find P(Disease|positive Test).

Denote D = {the tested person has the disease}, <math>D^c</math> = {the tested person does not have the disease} and T = {the test result is positive}. Then
<center><math>P(D | T) = {P(T | D) P(D) \over P(T)} = {P(T | D) P(D) \over P(T|D)P(D) + P(T|D^c)P(D^c)}</math>
<math>={0.95\times 0.005 \over {0.95\times 0.005 +0.01\times 0.995}}\approx 0.323.</math></center>
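
The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch; the function name <code>posterior</code> and its parameter names are assumptions made for illustration, and the three input values are the ones stated in the example.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(D | T) by Bayes' rule; the denominator P(T) uses the law of total probability."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

# P(D) = 0.005, P(T | D) = 0.95, P(T | D^c) = 0.01, as given in the example.
p_disease_given_positive = posterior(prior=0.005, sensitivity=0.95, false_positive_rate=0.01)
print(round(p_disease_given_positive, 3))
```

Because the disease is rare, most positive tests come from the much larger healthy group, so the posterior probability stays well below the test's 0.95 sensitivity.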

===See also===
* [[AP_Statistics_Curriculum_2007_Bayesian_Prelim | Bayesian Chapter]]

===[[EBook_Problems_Prob_Rules|Problems]]===

===References===
* [http://www.stat.ucla.edu/%7Edinov/courses_students.dir/07/Fall/STAT13.1.dir/STAT13_notes.dir/lecture03.pdf Probability Lecture Notes (PDF)]

<hr>
* SOCR Home page: http://www.socr.ucla.edu

{{translate|pageName=http://wiki.stat.ucla.edu/socr/index.php?title=AP_Statistics_Curriculum_2007_Prob_Rules}}

AP Statistics Curriculum 2007 EDA Var

2010-05-14T00:00:02Z

ChiccoChou: /* Quartiles and IQR */

==[[AP_Statistics_Curriculum_2007 | General Advance-Placement (AP) Statistics Curriculum]] - Measures of Variation==

===Measures of Variation and Dispersion===
There are many measures of (population or sample) variation, e.g., the range, the variance, the standard deviation, mean absolute deviation, etc. These are used to assess the dispersion or spread of the population.

Suppose we are interested in the long-jump performance of some students. We can carry out an experiment by randomly selecting 8 male statistics students and asking them to perform the standing long jump. In reality every student participated, but for the ease of calculations below we will focus on these eight students. The long jumps were as follows:
<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|+Long-Jump (inches) Sample Data
|-
| 60 || 64 || 68 || 74 || 76 || 78 || 80 || 106
|}
</center>

===Range===
The range is the easiest measure of dispersion to calculate, yet, perhaps not the best measure. The '''Range = max - min'''. For example, for the Long Jump data, the range is calculated by:
<center>Range = 106 – 60 = 46.</center>

Note that the range is only sensitive to the extreme values of a sample and ignores all other information. So, two completely different distributions may have the same range.
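
As a small illustration of this point, the Python sketch below compares the long-jump sample with a second, hypothetical sample (<code>polarized</code>, invented here purely for illustration) that has the same minimum and maximum but a very different spread.

```python
import statistics

jumps = [60, 64, 68, 74, 76, 78, 80, 106]          # long-jump data from the table above
polarized = [60, 60, 60, 60, 106, 106, 106, 106]   # hypothetical sample, same extremes

def value_range(values):
    """Range = max - min."""
    return max(values) - min(values)

print(value_range(jumps), value_range(polarized))
print(statistics.stdev(jumps), statistics.stdev(polarized))
```

Both samples have the same range, yet their standard deviations differ considerably, because the range ignores everything between the two extremes.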

===Quartiles and IQR===
The first quartile (<math>Q_1</math>) and the third quartile (<math>Q_3</math>) are the values that split the dataset into ''bottom-25% vs. top-75%'' and ''bottom-75% vs. top-25%'', respectively. Thus the inter-quartile range (IQR), which is the difference <math>Q_3 - Q_1</math>, spans the central 50% of the data and can be considered as a measure of data dispersion or variation. The wider the IQR, the more variable the data.

For example, <math>Q_1=(64+68)/2=66</math>, <math>Q_3=(78+80)/2=79</math> and <math>IQR=Q_3-Q_1=13</math>, for the Long-Jump data shown above. Thus we expect the middle half of all long jumps (for that population) to be between 66 and 79 inches.
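
The quartiles in this example are the medians of the lower and upper halves of the sorted data. The Python sketch below follows that convention; the helper name <code>quartiles</code> is hypothetical, and other software may use a different quartile convention and return slightly different values.

```python
from statistics import median

jumps = [60, 64, 68, 74, 76, 78, 80, 106]

def quartiles(values):
    """Return (Q1, Q2, Q3), taking Q1 and Q3 as medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd sample size the middle observation is left out of both halves.
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return median(lower), median(data), median(upper)

q1, q2, q3 = quartiles(jumps)
iqr = q3 - q1
five_number_summary = (min(jumps), q1, q2, q3, max(jumps))
print(q1, q3, iqr)
print(five_number_summary)
```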

===Five-number summary===
The five-number summary for a dataset is the 5-tuple <math>\{min, Q_1, Q_2, Q_3, max\}</math>, containing the sample minimum, first-quartile, second-quartile (median), third-quartile, and maximum.

===Variance and Standard Deviation===
The logic behind the variance and standard deviation measures is to measure the difference between each observation and the mean (i.e., dispersion). Suppose we have ''n > 1'' observations, <math>\left \{ y_1, y_2, y_3, ..., y_n \right \}</math>. The deviation of the <math>i^{th}</math> measurement, <math>y_i</math>, from the mean (<math>\overline{y}</math>) is defined by <math>(y_i - \overline{y})</math>.

Does the average of these deviations seem like a reasonable way to find an average deviation for the sample or the population? No, because the sum of all deviations is always zero, as the positive and negative deviations cancel:
<center><math>\sum_{i=1}^n{(y_i - \overline{y})}=0.</math></center>

To solve this problem we use measures that ignore the sign of each deviation, such as a version of the '''mean absolute deviation''':
<center><math>{1 \over n-1}\sum_{i=1}^n{|y_i - \overline{y}|}.</math></center>

Alternatively, squaring the deviations gives the '''variance''', which is defined as:
<center><math>{1 \over n-1}\sum_{i=1}^n{(y_i - \overline{y})^2}.</math></center>

And the '''standard deviation''' is defined as:
<center><math>\sqrt{{1 \over n-1}\sum_{i=1}^n{(y_i - \overline{y})^2}}.</math></center>

For the long-jump sample of 8 measurements, the standard deviation is:
<center><math>\sqrt{{1 \over 8-1} \left \{(60-75.75)^2 + (64-75.75)^2 + (68-75.75)^2 + (74-75.75)^2 + (76-75.75)^2 + (78-75.75)^2 + (80-75.75)^2 + (106-75.75)^2 \right \} } = 14.079.</math></center>
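
The same calculation can be written out step by step in Python. This is a minimal sketch using only the standard library; the variable names are chosen for illustration, and <code>statistics.stdev</code> is used only as a cross-check of the hand-written formula.

```python
import math
import statistics

jumps = [60, 64, 68, 74, 76, 78, 80, 106]

n = len(jumps)
mean = sum(jumps) / n                                 # sample mean
sum_sq_dev = sum((y - mean) ** 2 for y in jumps)      # sum of squared deviations
variance = sum_sq_dev / (n - 1)                       # sample variance (divisor n - 1)
std_dev = math.sqrt(variance)                         # sample standard deviation

print(mean, sum_sq_dev, round(std_dev, 3))
print(math.isclose(std_dev, statistics.stdev(jumps)))  # cross-check
```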

===Activities===
Try to pair each of the 4 samples whose numerical summaries are reported below with one of the 4 frequency plots below. Explain your answers.
{| class="wikitable" style="text-align:center; width:75%" border="1"
|+Numerical Summaries of the 4 Samples
|-
| Sample || Mean || Median || StdDev
|-
| A || 4.688 || 5.000 || 1.493
|-
| B || 4.000 || 4.000 || 1.633
|-
| C || 3.933 || 4.000 || 1.387
|-
| D || 4.000 || 4.000 || 2.075
|}

<center>[[Image:SOCR_EBook_Dinov_EDA_012708_Fig10.jpg|500px]]</center>

===Notes===
*Some software packages may use <math>{1 \over n}</math> instead of the <math>{1 \over n-1}</math> that we used above. Note that for large sample sizes the difference between the two becomes negligible. Also, the sample variance, as defined above, has useful theoretical properties (e.g., the sample variance is an unbiased estimate of the population variance).

*Most of the [http://socr.ucla.edu/htmls/SOCR_Charts.html SOCR Charts] and [http://socr.ucla.edu/htmls/SOCR_Analyses.html SOCR Analyses] compute the variance or standard deviation for the sample. You can see these examples of [[SOCR_EduMaterials_ChartsActivities | Charts Activities]] and [[SOCR_EduMaterials_AnalysesActivities | Analyses Activities]] and you can test these using [[SOCR_012708_ID_Data_HotDogs | hotdogs dataset]].
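
The effect of the two divisors can be seen on the long-jump data. The sketch below uses the Python standard library, where <code>statistics.stdev</code> divides by ''n - 1'' and <code>statistics.pstdev</code> divides by ''n''; it is an illustration only and says nothing about which convention a particular SOCR tool uses.

```python
import math
import statistics

jumps = [60, 64, 68, 74, 76, 78, 80, 106]
n = len(jumps)

sample_sd = statistics.stdev(jumps)        # divisor n - 1
population_sd = statistics.pstdev(jumps)   # divisor n

ratio = population_sd / sample_sd
expected_ratio = math.sqrt((n - 1) / n)    # the two estimates differ by this factor

print(sample_sd, population_sd, ratio)
```

The ratio of the two estimates is the square root of ''(n - 1)/n'', which approaches 1 as the sample size grows.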

===[[EBook_Problems_EDA_Var | Problems]]===

<hr>

===References===
* [http://www.stat.ucla.edu/%7Edinov/courses_students.dir/07/Fall/STAT13.1.dir/STAT13_notes.dir/lecture02.pdf Lecture notes on EDA]

<hr>
* SOCR Home page: http://www.socr.ucla.edu

{{translate|pageName=http://wiki.stat.ucla.edu/socr/index.php?title=AP_Statistics_Curriculum_2007_EDA_Var}}

AP Statistics Curriculum 2007 EDA Var

2010-05-13T23:58:12Z

ChiccoChou: /* Quartiles and IQR */

==[[AP_Statistics_Curriculum_2007 | General Advance-Placement (AP) Statistics Curriculum]] - Measures of Variation==

===Measures of Variation and Dispersion===
There are many measures of (population or sample) variation, e.g., the range, the variance, the standard deviation, mean absolute deviation, etc. These are used to assess the dispersion or spread of the population.

Suppose we are interested in the long-jump performance of some students. We can carry out an experiment by randomly selecting 8 male statistics students and asking them to perform the standing long jump. In reality every student participated, but for the ease of calculations below we will focus on these eight students. The long jumps were as follows:
<center>
{| class="wikitable" style="text-align:center; width:75%" border="1"
|+Long-Jump (inches) Sample Data
|-
| 60 || 64 || 68 || 74 || 76 || 78 || 80 || 106
|}
</center>

===Range===
The range is the easiest measure of dispersion to calculate, yet, perhaps not the best measure. The '''Range = max - min'''. For example, for the Long Jump data, the range is calculated by:
<center>Range = 106 – 60 = 46.</center>

Note that the range is only sensitive to the extreme values of a sample and ignores all other information. So, two completely different distributions may have the same range.

===Quartiles and IQR===
The first quartile (<math>Q_1</math>) and the third quartile (<math>Q_3</math>) are the values that split the dataset into ''bottom-25% vs. top-75%'' and ''bottom-75% vs. top-25%'', respectively. Thus the inter-quartile range (IQR), which is the difference <math>Q_3 - Q_1</math>, spans the central 50% of the data and can be considered as a measure of data dispersion or variation. The wider the IQR, the more variable the data.

For example, <math>Q_1=(64+68)/2=66</math>, <math>Q_3=(78+80)/2=79</math> and <math>IQR=Q_3-Q_1=13</math>, for the Long-Jump data shown above. Thus we expect the middle half of all long jumps (for that population) to be between 66 and 79 inches.

===Five-number summary===
The five-number summary for a dataset is the 5-tuple <math>\{min, Q_1, Q_2, Q_3, max\}</math>, containing the sample minimum, first-quartile, second-quartile (median), third-quartile, and maximum.

===Variance and Standard Deviation===
The logic behind the variance and standard deviation measures is to measure the difference between each observation and the mean (i.e., dispersion). Suppose we have ''n > 1'' observations, <math>\left \{ y_1, y_2, y_3, ..., y_n \right \}</math>. The deviation of the <math>i^{th}</math> measurement, <math>y_i</math>, from the mean (<math>\overline{y}</math>) is defined by <math>(y_i - \overline{y})</math>.

Does the average of these deviations seem like a reasonable way to find an average deviation for the sample or the population? No, because the sum of all deviations is always zero, as the positive and negative deviations cancel:
<center><math>\sum_{i=1}^n{(y_i - \overline{y})}=0.</math></center>

To solve this problem we use measures that ignore the sign of each deviation, such as a version of the '''mean absolute deviation''':
<center><math>{1 \over n-1}\sum_{i=1}^n{|y_i - \overline{y}|}.</math></center>

Alternatively, squaring the deviations gives the '''variance''', which is defined as:
<center><math>{1 \over n-1}\sum_{i=1}^n{(y_i - \overline{y})^2}.</math></center>

And the '''standard deviation''' is defined as:
<center><math>\sqrt{{1 \over n-1}\sum_{i=1}^n{(y_i - \overline{y})^2}}.</math></center>

For the long-jump sample of 8 measurements, the standard deviation is:
<center><math>\sqrt{{1 \over 8-1} \left \{(60-75.75)^2 + (64-75.75)^2 + (68-75.75)^2 + (74-75.75)^2 + (76-75.75)^2 + (78-75.75)^2 + (80-75.75)^2 + (106-75.75)^2 \right \} } = 14.079.</math></center>

===Activities===
Try to pair each of the 4 samples whose numerical summaries are reported below with one of the 4 frequency plots below. Explain your answers.
{| class="wikitable" style="text-align:center; width:75%" border="1"
|+Numerical Summaries of the 4 Samples
|-
| Sample || Mean || Median || StdDev
|-
| A || 4.688 || 5.000 || 1.493
|-
| B || 4.000 || 4.000 || 1.633
|-
| C || 3.933 || 4.000 || 1.387
|-
| D || 4.000 || 4.000 || 2.075
|}

<center>[[Image:SOCR_EBook_Dinov_EDA_012708_Fig10.jpg|500px]]</center>

===Notes===
*Some software packages may use <math>{1 \over n}</math> instead of the <math>{1 \over n-1}</math> that we used above. Note that for large sample sizes the difference between the two becomes negligible. Also, the sample variance, as defined above, has useful theoretical properties (e.g., the sample variance is an unbiased estimate of the population variance).

*Most of the [http://socr.ucla.edu/htmls/SOCR_Charts.html SOCR Charts] and [http://socr.ucla.edu/htmls/SOCR_Analyses.html SOCR Analyses] compute the variance or standard deviation for the sample. You can see these examples of [[SOCR_EduMaterials_ChartsActivities | Charts Activities]] and [[SOCR_EduMaterials_AnalysesActivities | Analyses Activities]] and you can test these using [[SOCR_012708_ID_Data_HotDogs | hotdogs dataset]].

===[[EBook_Problems_EDA_Var | Problems]]===

<hr>

===References===
* [http://www.stat.ucla.edu/%7Edinov/courses_students.dir/07/Fall/STAT13.1.dir/STAT13_notes.dir/lecture02.pdf Lecture notes on EDA]

<hr>
* SOCR Home page: http://www.socr.ucla.edu

{{translate|pageName=http://wiki.stat.ucla.edu/socr/index.php?title=AP_Statistics_Curriculum_2007_EDA_Var}}

AP Statistics Curriculum 2007 EDA Freq

2010-05-13T23:17:12Z

ChiccoChou: /* Definitions */

==[[AP_Statistics_Curriculum_2007 | General Advance-Placement (AP) Statistics Curriculum]] - Summarizing data with Frequency Tables==

===Summarizing data with Frequency Tables & Histograms===
There are two ways to describe a data set (sample from a population) - Pictorial Graphs or Tables of Numbers. Both are important for analyzing data.

===Definitions===
* A '''frequency distribution''' is a display of the number (frequency) of occurrences of each value in a data set.
* A '''relative frequency''' distribution is a display of the proportion (ratio of frequency to sample size) of occurrences of each value in a data set.
* A [http://en.wikipedia.org/wiki/Percentile percentile] is the value of a variable that divides the real line into two segments: the left one containing a certain percentage (say 13%) of the observations for the specific process, and the right one containing the complementary percentage of observations (in this case 87%). For example, the 30th percentile is the value (measurement) that lies above 30% and below 70% of the observations from a process.
* The (three) '''quartiles''' are special cases of percentiles: Q1 is the 25th percentile, Q2 is the 50th percentile (median) and Q3 is the 75th percentile.

===Example===
The table below shows the stage of disease at diagnosis of breast cancer in a random sample of 2092 US women.

<center>
{| class="wikitable"
|-
! Stage
! Frequency
! Relative Frequency
|-
| 0
| 197
| 0.09
|-
| I
| 691
| 0.33
|-
| II
| 703
| 0.34
|-
| III
| 314
| 0.15
|-
| IV
| 187
| 0.09
|-
| Total
| 2092
| 1.00
|}
</center>
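
The relative-frequency column is obtained by dividing each count by the sample size. A minimal Python sketch of that calculation follows; the dictionary name <code>stage_counts</code> is chosen here for illustration.

```python
# Frequencies from the table above (stage of disease at diagnosis).
stage_counts = {"0": 197, "I": 691, "II": 703, "III": 314, "IV": 187}

total = sum(stage_counts.values())
relative = {stage: count / total for stage, count in stage_counts.items()}

for stage, count in stage_counts.items():
    print(f"{stage:>5} {count:>5} {relative[stage]:.2f}")
print(f"Total {total:>5} {sum(relative.values()):.2f}")
```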

===Computational Resources: Internet-based SOCR Tools===
* [http://socr.ucla.edu/htmls/SOCR_Charts.html SOCR Charts] allows you to generate graphical representations (including frequency histograms) of a variety of datasets.
* The [[SOCR_EduMaterials_ChartsActivities | SOCR Charts activities]] provide usage-instructions, examples and demonstrations of how to use SOCR Charts.

===Hands-on activities===
You can copy and paste the first 2 columns in the data table above into [http://socr.ucla.edu/htmls/SOCR_Charts.html SOCR Charts] (BarChart --> XYPlot --> HistogramDemo7). You can see [[SOCR_EduMaterials_Activities_Histogram_Graphs | this SOCR Charts activity]] for help with histogram plots.
* The graph below illustrates the (raw) frequency histogram (using counts).
<center>[[Image:SOCR_EBook_Dinov_EDA_012708_Fig1.jpg|500px]]</center>

* The graph below shows the relative frequency histogram (using the last column of the table above).

<center>[[Image:SOCR_EBook_Dinov_EDA_012708_Fig2.jpg|500px]]</center>

===[[EBook_Problems_EDA_Freq | Problems]]===

<hr>

===References===
* [http://www.stat.ucla.edu/%7Edinov/courses_students.dir/07/Fall/STAT13.1.dir/STAT13_notes.dir/lecture02.pdf Lecture notes on EDA]

<hr>
* SOCR Home page: http://www.socr.ucla.edu

{{translate|pageName=http://wiki.stat.ucla.edu/socr/index.php?title=AP_Statistics_Curriculum_2007_EDA_Freq}}

AP Statistics Curriculum 2007 EDA Freq

2010-05-13T23:15:45Z

ChiccoChou: /* Definitions */

==[[AP_Statistics_Curriculum_2007 | General Advance-Placement (AP) Statistics Curriculum]] - Summarizing data with Frequency Tables==

===Summarizing data with Frequency Tables & Histograms===
There are two ways to describe a data set (sample from a population) - Pictorial Graphs or Tables of Numbers. Both are important for analyzing data.

===Definitions===
* A '''frequency distribution''' is a display of the number (frequency) of occurrences of each value in a data set.
* A '''relative frequency''' distribution is a display of the proportion (ratio of frequency to sample size) of occurrences of each value in a data set.
* A [http://en.wikipedia.org/wiki/Percentile percentile] is the value of a variable that divides the real line into two segments: the left one containing a certain percentage (say 13%) of the observations for the specific process, and the right one containing the complementary percentage of observations (in this case 87%). For example, the 30th percentile is the value (measurement) that lies above 30% and below 70% of the observations from a process.
* The (three) '''quartiles''' are special cases of percentiles: Q1 is the 25th percentile, Q2 is the 50th percentile (median) and Q3 is the 75th percentile.

===Example===
The table below shows the stage of disease at diagnosis of breast cancer in a random sample of 2092 US women.

<center>
{| class="wikitable"
|-
! Stage
! Frequency
! Relative Frequency
|-
| 0
| 197
| 0.09
|-
| I
| 691
| 0.33
|-
| II
| 703
| 0.34
|-
| III
| 314
| 0.15
|-
| IV
| 187
| 0.09
|-
| Total
| 2092
| 1.00
|}
</center>

===Computational Resources: Internet-based SOCR Tools===
* [http://socr.ucla.edu/htmls/SOCR_Charts.html SOCR Charts] allows you to generate graphical representations (including frequency histograms) of a variety of datasets.
* The [[SOCR_EduMaterials_ChartsActivities | SOCR Charts activities]] provide usage-instructions, examples and demonstrations of how to use SOCR Charts.

===Hands-on activities===
You can copy and paste the first 2 columns in the data table above into [http://socr.ucla.edu/htmls/SOCR_Charts.html SOCR Charts] (BarChart --> XYPlot --> HistogramDemo7). You can see [[SOCR_EduMaterials_Activities_Histogram_Graphs | this SOCR Charts activity]] for help with histogram plots.
* The graph below illustrates the (raw) frequency histogram (using counts).
<center>[[Image:SOCR_EBook_Dinov_EDA_012708_Fig1.jpg|500px]]</center>

* The graph below shows the relative frequency histogram (using the last column of the table above).

<center>[[Image:SOCR_EBook_Dinov_EDA_012708_Fig2.jpg|500px]]</center>

===[[EBook_Problems_EDA_Freq | Problems]]===

<hr>

===References===
* [http://www.stat.ucla.edu/%7Edinov/courses_students.dir/07/Fall/STAT13.1.dir/STAT13_notes.dir/lecture02.pdf Lecture notes on EDA]

<hr>
* SOCR Home page: http://www.socr.ucla.edu

{{translate|pageName=http://wiki.stat.ucla.edu/socr/index.php?title=AP_Statistics_Curriculum_2007_EDA_Freq}}

AP Statistics Curriculum 2007 IntroUses

2010-05-13T23:00:19Z

ChiccoChou: /* Loaded Questions in Surveys or Polls */

[[AP_Statistics_Curriculum_2007 | General Advance-Placement (AP) Statistics Curriculum]] - Uses and Abuses of Statistics

==Uses and Abuses of Statistics==
Statistics is the science of variation, randomness and chance. As such, statistics is different from the [http://en.wikipedia.org/wiki/Isaac_Newton Newtonian sciences], where the processes being studied obey exact deterministic mathematical laws and typically can be described as [http://en.wikipedia.org/wiki/Category:Equations systems]. Because statistics provides tools for data understanding where no other science can, one should be prepared to accept uncertainty in exchange for this new power of knowledge. In general, statistical analysis, inference and simulation will not provide deterministic answers and strict (e.g., yes/no, presence/absence) responses to questions involving stochastic processes. Rather, statistics will provide quantitative inference represented as long-run probability values, confidence or prediction intervals, odds, chances, etc., which may ultimately be subjected to varying interpretations.

This possibility of multiple interpretations may be viewed by some as detrimental or inconsistent. But others consider these outcomes as beautiful, scientific and elegant responses to challenging problems that are inherently stochastic. The phrase ''Uses and Abuses of Statistics'' refers to this notion that in some cases statistical results may be used as evidence for seemingly opposite theses. However, most of the time, common [http://en.wikipedia.org/wiki/Logic principles of logic] allow us to disambiguate the obtained statistical inference. [[AP_Statistics_Curriculum_2007_IntroUses#References | Some appropriate probability and statistics quotes are provided in the references section]].

==Approach==
When presented with a problem, data and statistical inference about a phenomenon, one needs to critically assess the validity of the assumptions, accuracy of the models and correctness of the interpretation of the thesis. There are many so-called paradoxes, where one can easily be convinced of an erroneous conclusion, because the underlying principles are violated (e.g., [http://en.wikipedia.org/wiki/Simpson_paradox Simpson's paradox], the [http://en.wikipedia.org/wiki/Birthday_paradox Birthday paradox], etc.). Critical evaluation of the design of the experiment, data collection, measurements and validity of the analysis strategy should lead to correct inference and interpretation in most cases.

In summary, one must:
* be presented with a problem
* critically analyze the given information
* design an experiment to collect data
* analyze the collection
* evaluate the experiment
* validate the inferences and interpretations made

==Examples of Common Causes for Data Misinterpretation==
===Unrepresentative Samples===
These are collections of data measurements or observations that do not adequately describe the natural process or phenomenon being studied. The phrase ''garbage-in, garbage-out'' refers to this situation and implies that none of the conclusions or the inference based on such unrepresentative samples should be trusted. In general, collecting a population-representative sample is a hard experimental design problem.
* '''Self-Selection''' - voluntary response samples, where the respondents, units or participants decide themselves whether to be included in the sample, survey or experiment.
* ''Non-Sampling Errors'' (e.g., non-response bias) are errors in the data collection that are not due to the process of sampling or the study design.

===Sampling Errors===
Sampling errors arise from a decision to use a sample rather than measure the entire population.

===Samples of Small Sizes===
Small sample sizes may significantly distort the interpretation of the data, or results, because a small-sample data [[EBook#Chapter_II:_Describing.2C_Exploring.2C_and_Comparing_Data | distribution may have completely different characteristics]] from those of the native population the sample is drawn from (e.g., center, spread, shape, etc.). For example, use the [[SOCR_EduMaterials_Activities_GeneralCentralLimitTheorem | SOCR CLT activity]] to draw small samples from a variety of distributions and compare the sample histogram against the population distribution. Their characteristics will be mostly similar, but sometimes they will be drastically different.
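
The same idea can be sketched with a short simulation outside of the SOCR tools. The Python code below is only an illustration under stated assumptions: the population is taken to be exponential with mean 1, the sample sizes (5 and 500), the number of replicates and the seed are arbitrary choices, and the helper <code>sample_means</code> is hypothetical.

```python
import random
import statistics

random.seed(2010)  # fixed seed so that repeated runs give the same numbers

def sample_means(sample_size, replicates=1000):
    """Means of repeated random samples from an exponential population with mean 1."""
    means = []
    for _ in range(replicates):
        sample = [random.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means

small = sample_means(5)     # many small samples
large = sample_means(500)   # many large samples

# How much the sample mean varies from sample to sample around the population mean (1.0).
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)
print(spread_small, spread_large)
print(min(small), max(small))
```

The means of the small samples scatter much more widely around the population mean than those of the large samples, so any single small sample may describe the population center poorly.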

===Loaded Questions in Surveys or Polls===
The phrasing of questions, their intonation and emphasis may significantly affect the perception of the question (intentionally or unintentionally).

===Misleading Graphs===
Look at the quantitative information represented in a chart or plot, not at the shape, orientation, relation or pattern represented by the graph.
* Partial Pictures
* Deliberate Distortions
* Scale breaks and axes scaling

===Inappropriate estimates or statistics===
Erroneous population parameter estimates (intentionally or most likely unintentionally) may affect data collections. The source of the data and the method for parameter estimation should be carefully reviewed to avoid bias and misinterpretation of the data and results, and to support robust inference.

==Computational Resources: Internet-based SOCR Tools==
* [http://www.socr.ucla.edu/htmls/SOCR_Experiments.html SOCR Experiments]
* [http://www.socr.ucla.edu/htmls/SOCR_Charts.html SOCR Charts]

==Examples & Hands-on Activities==
* [[SOCR_EduMaterials_Activities_BirthdayExperiment | Birthday Paradox I]]
* [[SOCR_EduMaterials_Activities_Birthday | Birthday Paradox II]]

==[[EBook_Problems_EDA_IntroUses|Problems]]==

<hr>
==References==
* [[SOCR_Quotes | SOCR Quotes]]

<hr>
* SOCR Home page: http://www.socr.ucla.edu

{{translate|pageName=http://wiki.stat.ucla.edu/socr/index.php?title=AP_Statistics_Curriculum_2007_IntroUses}}

AP Statistics Curriculum 2007 IntroUses

2010-05-13T22:55:41Z

ChiccoChou:

[[AP_Statistics_Curriculum_2007 | General Advance-Placement (AP) Statistics Curriculum]] - Uses and Abuses of Statistics

==Uses and Abuses of Statistics==
Statistics is the science of variation, randomness and chance. As such, statistics is different from the [http://en.wikipedia.org/wiki/Isaac_Newton Newtonian sciences], where the processes being studied obey exact deterministic mathematical laws and typically can be described as [http://en.wikipedia.org/wiki/Category:Equations systems]. Because statistics provides tools for data understanding where no other science can, one should be prepared to accept uncertainty in exchange for this new power of knowledge. In general, statistical analysis, inference and simulation will not provide deterministic answers and strict (e.g., yes/no, presence/absence) responses to questions involving stochastic processes. Rather, statistics will provide quantitative inference represented as long-run probability values, confidence or prediction intervals, odds, chances, etc., which may ultimately be subjected to varying interpretations.

This possibility of multiple interpretations may be viewed by some as detrimental or inconsistent. But others consider these outcomes as beautiful, scientific and elegant responses to challenging problems that are inherently stochastic. The phrase ''Uses and Abuses of Statistics'' refers to this notion that in some cases statistical results may be used as evidence for seemingly opposite theses. However, most of the time, common [http://en.wikipedia.org/wiki/Logic principles of logic] allow us to disambiguate the obtained statistical inference. [[AP_Statistics_Curriculum_2007_IntroUses#References | Some appropriate probability and statistics quotes are provided in the references section]].

==Approach==
When presented with a problem, data and statistical inference about a phenomenon, one needs to critically assess the validity of the assumptions, accuracy of the models and correctness of the interpretation of the thesis. There are many so-called paradoxes, where one can easily be convinced of an erroneous conclusion, because the underlying principles are violated (e.g., [http://en.wikipedia.org/wiki/Simpson_paradox Simpson's paradox], the [http://en.wikipedia.org/wiki/Birthday_paradox Birthday paradox], etc.). Critical evaluation of the design of the experiment, data collection, measurements and validity of the analysis strategy should lead to correct inference and interpretation in most cases.

In summary, one must:
* be presented with a problem
* critically analyze the given information
* design an experiment to collect data
* analyze the collection
* evaluate the experiment
* validate the inferences and interpretations made

==Examples of Common Causes for Data Misinterpretation==
===Unrepresentative Samples===
These are collections of data measurements or observations that do not adequately describe the natural process or phenomenon being studied. The phrase ''garbage-in, garbage-out'' refers to this situation and implies that none of the conclusions or the inference based on such unrepresentative samples should be trusted. In general, collecting a population-representative sample is a hard experimental design problem.
* '''Self-Selection''' - voluntary response samples, where the respondents, units or participants decide themselves whether to be included in the sample, survey or experiment.
* ''Non-Sampling Errors'' (e.g., non-response bias) are errors in the data collection that are not due to the process of sampling or the study design.

===Sampling Errors===
Sampling errors arise from a decision to use a sample rather than measure the entire population.

===Samples of Small Sizes===
Small sample sizes may significantly distort the interpretation of the data, or results, because a small-sample data [[EBook#Chapter_II:_Describing.2C_Exploring.2C_and_Comparing_Data | distribution may have completely different characteristics]] from those of the native population the sample is drawn from (e.g., center, spread, shape, etc.). For example, use the [[SOCR_EduMaterials_Activities_GeneralCentralLimitTheorem | SOCR CLT activity]] to draw small samples from a variety of distributions and compare the sample histogram against the population distribution. Their characteristics will be mostly similar, but sometimes they will be drastically different.

===Loaded Questions in Surveys or Polls===
The phrasing of questions, their intonation and emphasis may significantly affect the perception of the question (intentionally or unintentionally).

===Misleading Graphs===
Look at the quantitative information represented in a chart or plot, not at the shape, orientation, relation or pattern represented by the graph.
* Partial Pictures
* Deliberate Distortions
* Scale breaks and axes scaling

===Inappropriate estimates or statistics===
Erroneous population parameter estimates (intentionally or most likely unintentionally) may affect data collections. The source of the data and the method for parameter estimation should be carefully reviewed to avoid bias and misinterpretation of the data and results, and to support robust inference.

==Computational Resources: Internet-based SOCR Tools==
* [http://www.socr.ucla.edu/htmls/SOCR_Experiments.html SOCR Experiments]
* [http://www.socr.ucla.edu/htmls/SOCR_Charts.html SOCR Charts]

==Examples & Hands-on Activities==
* [[SOCR_EduMaterials_Activities_BirthdayExperiment | Birthday Paradox I]]
* [[SOCR_EduMaterials_Activities_Birthday | Birthday Paradox II]]

==[[EBook_Problems_EDA_IntroUses|Problems]]==

<hr>
==References==
* [[SOCR_Quotes | SOCR Quotes]]

<hr>
* SOCR Home page: http://www.socr.ucla.edu

{{translate|pageName=http://wiki.stat.ucla.edu/socr/index.php?title=AP_Statistics_Curriculum_2007_IntroUses}}