EBook Problems EDA IntroVar

From SOCR
Revision as of 23:32, 25 October 2009 by IvoDinov (talk | contribs) (EBook Problems Set - The Nature of Data and Variation Problems)

EBook Problems Set - The Nature of Data and Variation Problems

Problem 1

Researchers do a study on the number of cars that a person owns. They think that the distribution of their data might be normal, even though the median is much smaller than the mean. They make a normal probability plot (p-plot). What does it look like?

  • Choose one answer.
(a) It's not a straight line.
(b) It's a bell curve.
(c) It's a group of points clustered around the middle of the plot.
(d) It's a straight line.
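One way to see what a normal probability plot measures is to compute its points directly. The sketch below is only an illustration and is not part of the original problem: the car counts are made-up values, the plotting positions (i - 0.5)/n are one common convention among several, and `probability_plot_r` is a hypothetical helper name. It pairs the sorted data with standard normal quantiles and reports how close those points come to a straight line.

```python
from math import sqrt
from statistics import NormalDist

def normal_quantiles(n):
    """Standard normal quantiles at plotting positions (i - 0.5) / n (one common convention)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def probability_plot_r(data):
    """Correlation between sorted data and normal quantiles; near 1 means a nearly straight plot."""
    xs = sorted(data)
    qs = normal_quantiles(len(xs))
    mx = sum(xs) / len(xs)
    mq = sum(qs) / len(qs)
    sxq = sum((x - mx) * (q - mq) for x, q in zip(xs, qs))
    sxx = sum((x - mx) ** 2 for x in xs)
    sqq = sum((q - mq) ** 2 for q in qs)
    return sxq / sqrt(sxx * sqq)

# Hypothetical right-skewed sample (mean well above median), not data from the study.
cars_owned = [0, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 6, 9, 15]
# A sample built to be exactly linear in the normal quantiles, for comparison.
symmetric = [50 + 10 * q for q in normal_quantiles(14)]

r_skewed = probability_plot_r(cars_owned)
r_symmetric = probability_plot_r(symmetric)
print(round(r_skewed, 3), round(r_symmetric, 3))
```

The second sample lies on a line by construction, so its correlation is essentially 1; the skewed sample bends away from the line and scores noticeably lower.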


Problem 2

Bicycles arrive at a bike shop in boxes. Before they can be sold, they must be unpacked, assembled, and tuned (lubricated, adjusted, etc.). Based on past experience, the shop manager makes the following assumptions about how long this may take:

  • The times for each setup phase are independent.
  • The times for each phase follow a Normal curve.
  • The means and standard deviations of the times (in minutes) are as shown in the table below.

Phase Mean SD
Unpacking 3.5 0.7
Assembly 21.8 2.4
Tuning 12.3 2.7

What are the mean and standard deviation for the total bicycle set up time?

  • Choose one answer.
(a) Mean = 100 min, standard deviation = 12 min
(b) Can't be determined with the information given
(c) Mean = 37.6 min, standard deviation = 3.7 min
(d) Mean = 20 min, standard deviation = 13.69 min
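The rule behind this kind of question can be sketched in a few lines: for independent random variables, the means add and the variances (not the standard deviations) add. The helper name `total_mean_sd` and the three-stage figures below are hypothetical and deliberately differ from the bike-shop table.

```python
from math import sqrt

def total_mean_sd(means, sds):
    """Mean and SD of a sum of independent random variables."""
    total_mean = sum(means)
    total_var = sum(sd ** 2 for sd in sds)  # variances add, SDs do not
    return total_mean, sqrt(total_var)

# Hypothetical three-stage process (minutes); not the bike-shop figures.
stage_means = [5.0, 10.0, 15.0]
stage_sds = [3.0, 4.0, 12.0]

total_mean, total_sd = total_mean_sd(stage_means, stage_sds)
print(total_mean, total_sd)  # 30.0 13.0
```

Independence is what allows the variances to be summed; the Normal assumption additionally makes the total itself Normal.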


Problem 3

Let X be a random variable with mean 80 and standard deviation 12. Find the mean and the variance of the following variable: 2X - 100

  • Choose one answer.
(a) Mean = 100, variance = 288
(b) Mean = 60, variance = 12
(c) Mean = 160, variance = 144
(d) Mean = 60, variance = 576
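For a linear transformation Y = aX + b, the mean becomes a*mean + b, the standard deviation is multiplied by |a|, and the shift b has no effect on spread. A minimal sketch of these rules follows; `linear_transform_stats` is a hypothetical helper, and the numbers are illustrative rather than those of the problem.

```python
def linear_transform_stats(mean, sd, a, b):
    """Mean, SD and variance of Y = a*X + b, given the mean and SD of X."""
    new_mean = a * mean + b
    new_sd = abs(a) * sd          # adding b shifts the center but not the spread
    return new_mean, new_sd, new_sd ** 2

# Hypothetical example: X has mean 50 and SD 4; Y = 3X - 10.
y_mean, y_sd, y_var = linear_transform_stats(50, 4, a=3, b=-10)
print(y_mean, y_sd, y_var)  # 140 12 144
```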


Problem 4

Let X be a random variable with mean 80 and standard deviation 12. Find the mean and the standard deviation of the following variable: X - 20

  • Choose one answer.
(a) Mean = 60, standard deviation = 144
(b) Mean = 60, standard deviation = 12
(c) Mean = 80, standard deviation = 12
(d) Mean = 60, standard deviation = -8


Problem 5

A physician collected data on 1000 patients to examine their heights. A statistician hired to look at the files noticed the typical height was about 60 inches, but found that one height was 720 inches. This is clearly an outlier. The physician is out of town and can't be contacted, but the statistician would like to have some preliminary descriptions of the data to present when the doctor returns. Which of the following best describes how the statistician should handle this outlier?

  • Choose one answer.
(a) The statistician should publish a paper on the emergence of a new race of giants.
(b) The statistician should keep the data point in; each point is too valuable to drop one.
(c) The statistician should drop the observation from the analysis because this is clearly a mistake; the person would be 60 feet tall.
(d) The statistician should analyze the data twice, once with and once without this data point, and then compare how the point affects conclusions.
(e) The statistician should drop the observation from the dataset because we can't analyze the data with it.
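Comparing summaries with and without a suspect value is straightforward to sketch. The heights below are hypothetical stand-ins (the real file has 1000 patients), and `summarize` is an illustrative helper, not a prescribed procedure.

```python
from statistics import mean, median, stdev

def summarize(values):
    """Basic summary statistics for a list of numbers."""
    return {"n": len(values), "mean": mean(values),
            "median": median(values), "sd": stdev(values)}

# Hypothetical heights in inches; 720 stands in for the suspected recording error.
heights = [58, 59, 60, 60, 61, 62, 720]
suspect = 720

with_point = summarize(heights)
without_point = summarize([h for h in heights if h != suspect])

for label, stats in (("with", with_point), ("without", without_point)):
    print(label, stats)
```

In this made-up sample the median is the same in both runs, while the mean and standard deviation change dramatically, which is the kind of comparison a preliminary report could present.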


Problem 6

In a company where fewer than half of the employees earn less than the average income, what do you expect the distribution of income to look like?

  • Choose one answer.
(a) Bimodal
(b) Skewed to the right or positively skewed
(c) Symmetrical
(d) Skewed to the left or negatively skewed


Problem 7

Which of the following summary measures is most sensitive to outliers?

  • Choose one answer.
(a) Standard deviation
(b) Interquartile range
(c) Mode
(d) Median


Problem 8

Which value given below is the best representative for the following data?

2, 3, 4, 4, 4, 4, 4, 5, 6, 7, 8, 9, 9, 9, 9, 9, 10, 11

  • Choose one answer.
(a) The weighted average of the two modes or (4*5 + 9*5)/10 = 6.5
(b) No single number could represent this data set
(c) The average of the two modes or (4 + 9) / 2 = 6.5
(d) The mean or (2 + 3 + 4 + … + 10 + 11)/18 = 5.9
(e) The median or (6 + 7)/2 = 6.5
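When a data set has two peaks, it can help to compute the modes alongside the mean and median before deciding how to describe it. The sketch below uses a small hypothetical bimodal sample (not the data listed in the problem) and Python's `statistics.multimode`, which returns every value tied for most frequent.

```python
from statistics import mean, median, multimode

# Hypothetical bimodal sample, not the data from the problem.
sample = [1, 1, 1, 2, 3, 7, 8, 8, 8]

modes = multimode(sample)        # all values tied for the highest count
center_mean = mean(sample)
center_median = median(sample)

print(modes, round(center_mean, 2), center_median)
```

Here the mean and median both land between the two clusters, in a region where few observations actually fall.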


Problem 9

Suppose that the distribution of exam scores has mean = 20.5, standard deviation = 2.5, and median = 15.0. If you double each score, determine the mean, standard deviation, and median of the transformed distribution.

  • Choose one answer.
(a) mean = 41.0, standard deviation = 5.0, median = 30.0
(b) We cannot determine the statistics unless we have the actual data.
(c) mean = 20.5, standard deviation = 5.0, median = 15.0
(d) mean = 20.5, standard deviation = 2.5, median = 15.0
(e) mean = 41.0, standard deviation = 2.5, median = 30.0
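The effect of rescaling every value can also be checked empirically. The scores below are hypothetical; the sketch simply doubles each one and reports the ratio of each statistic after and before.

```python
from math import isclose
from statistics import mean, median, pstdev

scores = [10, 12, 14, 20, 24]            # hypothetical exam scores
doubled = [2 * s for s in scores]

before = (mean(scores), median(scores), pstdev(scores))
after = (mean(doubled), median(doubled), pstdev(doubled))

# Ratio of each statistic (mean, median, SD) after doubling to its value before.
ratios = [a / b for a, b in zip(after, before)]
print(ratios)
```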


Problem 10

A recent housing survey was conducted to determine the price of a typical home in Glendale, CA. Glendale is mostly middle-class, with one very expensive suburb. The mean price of a house was roughly $650,000. Which of the following statements is most likely to be true?

  • Choose one answer.
(a) There are about as many houses in Glendale that cost more than $650,000 as there are that cost less than this amount.
(b) Most houses in Glendale cost less than $650,000.
(c) Most houses in Glendale cost more than $650,000.
(d) We need to know the standard deviation to answer this question.
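A small numerical sketch shows how a few very expensive homes can pull the mean above most of the data. The prices are entirely hypothetical; they were chosen only so that the mean comes out to $650,000.

```python
from statistics import mean, median

# Hypothetical prices: nine mid-priced homes and one very expensive one.
prices = [500_000] * 9 + [2_000_000]

avg = mean(prices)
mid = median(prices)
share_below_mean = sum(p < avg for p in prices) / len(prices)

print(avg, mid, share_below_mean)
```

In this toy market nine of the ten homes sell for less than the mean, and the median sits well below it.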


Problem 11

UCLA biochemistry major Soo Kyung Lee, who plans to enroll in med school after graduation, collected data from the AAMC (Association of American Medical Colleges) website for 121 schools and included these attributes about each institution: name, public or private institution, state, location either East or West of the Mississippi River, cost of health insurance, resident tuition, resident fees, resident total expenses, nonresident tuition, nonresident fees, and nonresident total expenses in 2005. Soo was surprised that UCLA and other UC medical schools charge no tuition for residents. However, UC students pay about $20,000 in fees.

School type   Min      Q1        Median    Q3        Max
Private       $6,550   $30,729   $33,850   $36,685   $41,360
Public        $0       $10,219   $16,168   $18,800   $27,886

On the same scale, use the five-number summaries to construct two boxplots of resident tuition at the 73 public and 48 private medical colleges. Use the data and plots to determine which statement about centers is true.

(a) For private medical schools, the mean tuition of residents is greater than the median tuition for residents.
(b) With these data, we cannot determine the relationship between mean and median tuition for residents.
(c) For private medical schools, the mean tuition of residents is equal to the median tuition for residents.
(d) For private medical schools, the mean tuition of residents is less than the median tuition for residents.


Problem 12

Using the five-number summaries from Problem 11, determine which of the following statements is true about the spread of medical school resident tuition.

  • Choose one answer.
(a) There is the same variation in resident tuition at private medical schools as at public medical schools, since the ranges are almost equal.
(b) There is more variation in resident tuition at private medical schools than at public medical schools, since there are outliers for private schools.
(c) There is more variation in resident tuition at public medical schools than at private medical schools, since the interquartile range is wider for public schools.
(d) With these data, we cannot determine the variation for tuition for residents at private and public medical colleges.
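Two spread measures can be read straight off a five-number summary: the range (max minus min) and the interquartile range (Q3 minus Q1). The sketch below computes both from the tuition table in Problem 11; the dictionary layout and the helper name `spread` are illustrative choices.

```python
# Five-number summaries of resident tuition (dollars), copied from the table in Problem 11.
summaries = {
    "Private": {"min": 6550, "q1": 30729, "median": 33850, "q3": 36685, "max": 41360},
    "Public": {"min": 0, "q1": 10219, "median": 16168, "q3": 18800, "max": 27886},
}

def spread(s):
    """Range and interquartile range from a five-number summary."""
    return {"range": s["max"] - s["min"], "iqr": s["q3"] - s["q1"]}

spreads = {group: spread(s) for group, s in summaries.items()}
for group, values in spreads.items():
    print(group, values)
```

The range depends only on the two most extreme schools, whereas the IQR describes the middle half of each group, so the two measures need not rank the groups the same way.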


Problem 13

Use the boxplots and summary statistics from Problem 11 to determine which of the statements about outliers is true.

  • Choose one answer.
(a) There are outliers for both distributions
(b) UCLA is an outlier since UCLA does not charge any tuition for residents
(c) There is at least one outlier for the distribution of resident tuition for private medical schools
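A common boxplot convention flags values more than 1.5 × IQR below Q1 or above Q3 as outliers. The problem does not name a rule, so the 1.5 multiplier here is an assumption, and `fences` and `outlier_flags` are hypothetical helper names. With only a five-number summary, the check can show whether at least one outlier exists on each side (by testing the min and max), but not how many there are.

```python
# Five-number summaries of resident tuition (dollars), copied from the table in Problem 11.
summaries = {
    "Private": {"min": 6550, "q1": 30729, "median": 33850, "q3": 36685, "max": 41360},
    "Public": {"min": 0, "q1": 10219, "median": 16168, "q3": 18800, "max": 27886},
}

def fences(q1, q3, k=1.5):
    """Lower and upper fences under the k * IQR convention (k = 1.5 is an assumption here)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def outlier_flags(s):
    """Whether the min falls below the lower fence or the max above the upper fence."""
    low, high = fences(s["q1"], s["q3"])
    return {"lower_fence": low, "upper_fence": high,
            "low_outlier": s["min"] < low, "high_outlier": s["max"] > high}

flags = {group: outlier_flags(s) for group, s in summaries.items()}
for group, result in flags.items():
    print(group, result)
```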


Problem 14

Suppose that we create a new data set by doubling the highest value in a large data set of positive values. What statement is FALSE about the new data set?

  • Choose one answer.
(a) The mean increases
(b) The standard deviation increases
(c) The range increases
(d) The median and interquartile range both increase
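The effect of changing only the largest value can be explored on a small example. The data below are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, which is one of several quartile conventions.

```python
from statistics import mean, median, pstdev, quantiles

def summary(values):
    """Mean, SD, range, median and IQR of a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)   # default 'exclusive' method
    return {"mean": mean(values), "sd": pstdev(values),
            "range": max(values) - min(values),
            "median": median(values), "iqr": q3 - q1}

data = [2, 4, 5, 7, 8, 10, 12, 15, 20]   # hypothetical positive values
modified = sorted(data)
modified[-1] *= 2                        # double only the highest value

original_stats = summary(data)
modified_stats = summary(modified)
for key in original_stats:
    print(key, original_stats[key], "->", modified_stats[key])
```

Statistics that use every value or the extremes respond to the change, while those built from the middle of the ordered data do not.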


Problem 15

Consider a large data set of positive values and multiply each value by 100. Determine the statement which is true.

  • Choose one answer.
(a) The mean, median, and standard deviation increase.
(b) The mean and median increase but the standard deviation is unchanged.
(c) The standard deviation increases but the mean and median are unchanged.
(d) The range and interquartile range are unchanged.


