# EBook Problems EDA IntroDesign

## EBook Problems Set - Design of Experiments Problems

### Problem 1

Doctors at the UCLA Hospital are worried about some of the side effects of a drug used to treat cancer when that drug is prescribed in large amounts. 60 volunteers are randomly split into three groups of 20; the first group doesn't take the drug, the second group takes a low dosage of the drug, and the third group takes a high dosage of the drug. How many treatments are there in this experiment?

(a) There are 60 treatments, one for each volunteer.
(b) There is only one treatment used for this cancer, the drug being tested.
(c) There were 180 treatments, one for each level of the drug and one for each patient.
(d) There are three treatments, one for each level of the drug.
(e) We need to know what the dosages prescribed were in order to determine the treatments
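
The random split described in this problem can be sketched in a few lines. This is only an illustration, not part of the original study: the volunteer IDs (1 to 60), the fixed seed, and the group labels are assumptions made for the example.

```python
import random

# Hypothetical volunteer IDs 1..60; the seed is fixed only for reproducibility.
rng = random.Random(2024)
volunteers = list(range(1, 61))
rng.shuffle(volunteers)

# Deal the shuffled volunteers into three equal groups of 20.
groups = {
    "no drug": volunteers[0:20],
    "low dosage": volunteers[20:40],
    "high dosage": volunteers[40:60],
}

for name, members in groups.items():
    print(name, len(members))
```

Shuffling first and slicing afterwards gives every volunteer the same chance of landing in each group, which is the point of random assignment.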

### Problem 2

Suppose two researchers wanted to determine if aspirin reduced the chance of a heart attack. Researcher 1 studied the medical records of 500 patients. For each patient, he recorded whether the person took aspirin every day and if the person had ever had a heart attack. Then he reported the percentage of heart attacks for the patients who took aspirin every day and for those who did not take aspirin every day.

Researcher 2 also studied 500 people. He randomly assigned half of the patients to take aspirin every day and the other half to take a placebo every day. After a certain length of time, he reported the percentage of heart attacks for the patients who took aspirin every day and for those who did not take aspirin every day. Suppose that both researchers found that there is a statistically significant difference in the heart attack rates for the aspirin users and the non-aspirin users and that aspirin users had a lower rate of heart attacks. Can both researchers conclude that aspirin caused the reduction?

(a) No, only researcher 2 can conclude this.
(b) No, only researcher 1 can conclude this.
(c) Yes, because aspirin is known to reduce heart attacks.
(d) Yes, because aspirin users had a larger heart attack rate in both studies.

### Problem 3

Suppose that you were hired as a statistical consultant to design a study to examine the impact of a new medicine vs. a current medicine on lowering blood pressure. 50 patients volunteer to participate in the study. What design will you recommend?

(a) Completely randomized design with two factors.
(b) Completely randomized design with two factors and single blind.
(c) Completely randomized design.
(d) Completely randomized design with two factors and double blind.

### Problem 4

Hospital floors are usually covered by bare tiles. Carpets would cut down on noise but might be more likely to harbor germs. To study this possibility, investigators randomly assigned 8 of 16 available hospital rooms to have carpet installed. The others were left bare. Later, air from each room was pumped over a dish of agar. The dish was incubated for a fixed period, and the number of bacteria colonies was counted. Select the appropriate statistical term for the 8 rooms left bare.

(a) Treatments
(b) Experimental Units
(c) Control Group
(d) Response

### Problem 5

What conditions would need to be satisfied in order to say that a change in the variable X causes a change in the variable Y?

(a) When an experiment reveals that a change in X causes a change in Y.
(b) When possible confounding variables have been ruled out.
(c) When the correlation between X and Y is close to 1 or -1.

### Problem 6

Hospital floors are usually covered by bare tiles. Carpets would cut down on noise but might be more likely to harbor germs. To study this possibility, investigators randomly assigned 8 of 16 available hospital rooms to have carpet installed. The others were left bare. Later, air from each room was pumped over a dish of agar. The dish was incubated for a fixed period, and the number of bacteria colonies was counted. Select the appropriate statistical term for the 16 hospital rooms.

(a) Response
(b) Treatments
(c) Experimental Units
(d) Control Group

### Problem 7

Hospital floors are usually covered by bare tiles. Carpets would cut down on noise but might be more likely to harbor germs. To study this possibility, investigators randomly assigned 8 of 16 available hospital rooms to have carpet installed. The others were left bare. Later, air from each room was pumped over a dish of agar. The dish was incubated for a fixed period, and the number of bacteria colonies was counted. Select the appropriate statistical term for the number of colonies in a dish.

(a) Treatments
(b) Control Group
(c) Response
(d) Experimental Units

### Problem 8

Hospital floors are usually covered by bare tiles. Carpets would cut down on noise but might be more likely to harbor germs. To study this possibility, investigators randomly assigned 8 of 16 available hospital rooms to have carpet installed. The others were left bare. Later, air from each room was pumped over a dish of agar. The dish was incubated for a fixed period, and the number of bacteria colonies was counted. Select the appropriate statistical term for the number of colonies in a dish.

(a) Treatments
(b) Response
(c) Experimental Units
(d) Control Group

### Problem 9

Suppose that students A and B are working for the UCLA registrar. The registrar asks student A to calculate the mean and SD of the GPAs for the Fall 2005 freshman class. He asks student B to design a sampling strategy to evaluate the attitude of the undergraduates at UCLA toward undergraduate teaching.

(a) Student A is doing descriptive statistics and student B is doing inferential statistics.
(b) Student A is doing inferential statistics and student B is doing descriptive statistics.
(c) Both students are doing descriptive statistics.
(d) Both students are doing inferential statistics.

### Problem 10

A psychologist is examining the effect of showing pictures on learning of words by seven-year-olds. The seven-year-olds are randomly assigned to two groups. The experimental group is shown the word along with the picture. The control group is shown only the word. At the end of the experiment, the subjects are given a test on the number of words they get right. This is an example of:

(a) A blind study
(b) An experiment with a design flaw
(c) A double blind study
(d) A well-designed experiment

### Problem 11

We want to examine the effectiveness of three programs on the weight loss of men and women in the 40-50 year old age range. 150 men and 150 women participate in the study. Subjects are randomly assigned to the three programs. They spend 3-4 hours in the program per week and they continue the program for six months. Their weight is recorded before and after the program.

(a) This is an observational study because the subjects may spend less time in the program than they are supposed to
(b) This is an experiment because the researcher is measuring the weight of the subjects before and after participation in the program
(c) This is an experimental study because the subjects have been randomly assigned to different treatment groups
(d) This study is a combination of experimental and observational because we are collecting data on both the experimental and the control group as well as talking to the people.

### Problem 12

Flexible Brains

People who grow up left-handed have a different, more flexible brain structure than those born to take life by the right hand, say UCLA researchers who use twins to study heredity. The reason is that right-handers have genes that force their brains into a slightly more one-sided structure. Left-handers appear to be missing those genes.

“There is a real difference in brains that result in a more symmetric brain in left-handers, where the two sides are more equal,” said UCLA neurogeneticist Dr. Daniel Geschwind, who led the research team. “There is more flexibility and that is under genetic control.” That hereditary difference between right-handers and left-handers also appears to affect how the brain changes in size throughout a lifetime, the researchers found.

Of all the primates, only humans display such a strong predisposition to right-handedness. Right-handers make up about 90% of the population. The left and right halves of the brain are different in both anatomy and their features, related to hand preference. But until now, no one could document the connection.

To study brain size and structure, the UCLA researchers used a brain-scanning technique called functional magnetic resonance to compare brains in 72 pairs of identical twins, all of them male World War II veterans ages 75 to 85. Identical twins, who share the same genes, offer a unique lens through which to study the relative effects of heredity on human nature.

Right-handers typically have a larger left brain hemisphere, where their language abilities are concentrated. Conversely, left-handers have more balanced brains, with both sides relatively symmetric. “Overall, this study shows us that brain structure is highly influenced by genetics, even later in life,” Geschwind said. “This implies that aging-related changes to the brain also possess a strong genetic basis. That is kind of wild.”

After reading the article from the Los Angeles Times about Dr. Geschwind’s research, determine the response variables in the study.

(a) Brain-scanning technique called functional magnetic resonance
(c) Brain size and structure
(d) 72 pairs of male twins who were WWII veterans, ages 75 to 85 years
(e) left and right handedness

### Problem 13

Identify the explanatory variables in Dr. Geschwind's study.

(a) brain size and structure
(b) 72 pairs of male twins who were WWII veterans, ages 75 to 85 years
(c) left and right handedness
(d) brain-scanning technique called functional magnetic resonance

### Problem 14

Who were the subjects in Dr. Geschwind's study?

(a) brain size and structure
(b) 72 pairs of male twins who were WWII veterans, ages 75 to 85 years
(c) left and right handedness
(d) brain-scanning technique called functional magnetic resonance

### Problem 15

Determine whether Dr. Geschwind's study is observational or an experiment, and select the best justification for your answer.

(a) The study was an experiment since there was a control group
(b) The study was an experiment since treatments were imposed on the subjects
(c) There is not enough information in the article to determine whether or not the study is an experiment or observational
(d) The study was observational since no treatments were imposed on the subjects

### Problem 16

In a study appearing in the journal Science, a research team reports that plants in Southern England are flowering earlier in the spring. Records of the first flowering dates for 385 species over a period of 47 years show that flowering has advanced an average of 15 days per decade, an indication of climate warming, according to the authors. What is the number of cases in this data set?

(a) 385
(b) 1,000
(c) 47
(d) 18,095

### Problem 17

Scientists are interested in the effects of the sun on growth of moss on trees above the Arctic Circle. 25 years of data is collected and then analyzed. The study shows that the moss grows the most in the years where there is a moderate amount of sun during the summer, and the least in the years where the sun is mostly obscured by clouds during the summer. This is an example of an

(a) experimental study from which we can draw causal conclusions cautiously
(b) experimental study from which we cannot draw causal conclusions
(c) observational study from which we can draw causal conclusions
(d) experimental study from which we can draw causal conclusions
(e) observational study from which we cannot draw causal conclusions

### Problem 18

At the Department of Statistics, we intend to examine the effect of using computers in Statistics 10 on the attitudes of students toward statistics. We offer ten lectures of Statistics 10 in an academic year. Five of these lectures are randomly assigned to the experimental group and the other five are assigned to the control group. The experimental group will go to lecture, section, and computer lab. The control group will only go to lecture and section, but will not do the computer lab. The attitude of the students toward statistics is measured before and after the course. This study is:

(a) A double blind study
(b) A well-designed experiment
(c) A blind study
(d) Not a randomized experiment

### Problem 19

An office manager wonders whether there is any relationship between drinking coffee before 10 am and alertness. He selects 3 days of the week at random, and on those days he compares the alertness level of 25 employees who usually drink coffee before 10 am and 25 employees who do not usually drink coffee before 10 am. Is this an observational or experimental study?

(b) This is an experimental study
(c) This is an observational study
(d) This is a combination of an experimental and an observational study

### Problem 20

A study indicated that elderly people (age 70 and higher) who had pets lived longer and became less depressed than elderly people who did not have pets. The data came from the records of 700 elderly people who went to a local clinic for treatment. Based on pre-existing medical records, 400 had pets and 300 did not.

(a) This is an observational study because the data are obtained from the pre-existing medical records of the patients who went to the local clinic.
(b) This is a double-blind study because the patients do not know that they are being studied and the person in charge of the analysis does not know the names of the elderly.
(c) This is an experiment since treatments were imposed on the patients
(d) This is an experiment because the elderly people who have pets represent the experimental group and those without pets represent the control group.
(e) This is a blind study because the patients do not know that the hospital is studying the relationship between their pet ownership and whether or not they feel depressed.

### Problem 21

A major car manufacturing company intends to find out if cars get better mileage with premium instead of regular unleaded gasoline. They also would like to know if the size of the car has any effect on fuel economy. 96 volunteers who are similar in age, experience and style of driving participate in the study. The drivers are randomly assigned to the premium and regular groups. The drivers assigned to the premium and regular groups are then randomly assigned to drive a small, medium, or large car. All of the drivers are asked to keep a driving log. What is the design used for this study?

(a) Randomized block design
(b) Completely randomized two factor experiment
(c) Completely randomized experiment with one factor
(d) Completely randomized experiment with matching
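
The two-stage assignment described in this problem can be sketched as follows. The problem does not say how many drivers go into each group, so equal group sizes (48 per fuel type, 16 per fuel and car-size combination) are an assumption here, as are the driver IDs and the seed.

```python
import random
from collections import Counter

rng = random.Random(7)          # seed fixed only for reproducibility
drivers = list(range(1, 97))    # hypothetical driver IDs 1..96
rng.shuffle(drivers)

fuels = ["premium", "regular"]
sizes = ["small", "medium", "large"]

assignment = {}
for f_idx, fuel in enumerate(fuels):
    # Stage 1: the shuffled list is cut into two fuel groups of 48 (assumed equal).
    fuel_group = drivers[f_idx * 48:(f_idx + 1) * 48]
    rng.shuffle(fuel_group)
    for s_idx, size in enumerate(sizes):
        # Stage 2: within a fuel group, 16 drivers go to each car size.
        for driver in fuel_group[s_idx * 16:(s_idx + 1) * 16]:
            assignment[driver] = (fuel, size)

cell_counts = Counter(assignment.values())
for cell, count in sorted(cell_counts.items()):
    print(cell, count)
```

Each driver ends up with one fuel type and one car size, so the printout lists every fuel and size combination together with the number of drivers assigned to it.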

### Problem 22

An examination of the medical records of more than 250,000 women in the 20-40 year age range indicated that those who were overweight had longer than average labor when their first child was born. The study concluded that above average weight causes women in the 20-40 year age range to have longer labor when delivering their first child.

(a) To decide whether this conclusion is right, we need the scatterplot and the correlation coefficient for the weight of the mothers and the hours of labor.
(b) Given the large sample size used in the study they reached the right conclusion and they can generalize this to the overall population.
(c) This conclusion is not correct because there are many other factors other than weight that could contribute to long labor.
(d) They cannot draw such a conclusion because they did not randomly assign the subjects to the control and the experimental group.

### Problem 23

For this research situation, decide what statistical procedure would most likely be used to answer the research question posed. Assume all assumptions have been met for using the procedure.

Is ethnicity related to political party affiliation (Republican, Democrat, Other)?

(a) Test the difference in means between two paired or dependent samples.
(b) Use a chi-squared test of association.
(c) Test one mean against a hypothesized constant.
(d) Test the difference between two means (independent samples).
(e) Test for a difference in more than two means (one way ANOVA).

### Problem 23

We are interested in seeing if support for a school bond issue differs by neighborhood in a city. What statistical method should we use to answer our question?

(a) Test the difference in means between two paired or dependent samples.
(b) Use a chi-squared test of association.
(c) Test one mean against a hypothesized constant.
(d) Test the difference between two means (independent samples).
(e) Test for a difference in more than two means (one way ANOVA).

### Problem 24

We are interested in determining if there is a relationship between a person's sociability and cheerfulness. We assume sociability and cheerfulness can be measured by valid and reliable instruments. What test should we use to answer our question?

(a) Test the difference in means between two paired or dependent samples.
(b) Use a chi-squared test of association.
(c) Test one mean against a hypothesized constant.
(d) Test the difference between two means (independent samples).
(e) Test for a difference in more than two means (one way ANOVA).

### Problem 25

A researcher uses a chi-square test to determine if there is a relationship between 2 categorical variables. Which of the following p-values indicates the strongest evidence of such a relationship?

(a) 0.01
(b) 0.10
(c) 0.05
(d) 0.002
(e) 0.006
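
For readers who want to see where such a p-value comes from, here is a minimal sketch of a chi-square test of association computed by hand. The 2 x 3 table of counts is invented for illustration. The closed-form p-value used below holds only for 2 degrees of freedom, so a statistics library would be needed for other table shapes.

```python
import math

# Hypothetical 2 x 3 table of counts (rows: groups, columns: categories).
observed = [
    [30, 20, 10],
    [20, 20, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all cells, where the
# expected count under independence is row total * column total / n.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 2 the chi-square upper-tail probability equals exp(-x / 2).
p_value = math.exp(-chi2 / 2) if df == 2 else None

print(round(chi2, 3), df, p_value)
```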

### Problem 26

For a year, Sandra gathered data every day, measuring the temperature and counting the number of people on Venice Beach. She calculated a regression equation, using the temperature on a given day as the explanatory variable (X) and number of people on Venice Beach as the response variable (Y). Choose if you agree or disagree with this statement: Sandra can assume that the linear regression equation is an appropriate model for predicting the Y values outside her range of X values.

(a) Disagree, although the relationship for the set of data appears linear, it is not reliable to extrapolate with a model beyond the range of a data set.
(b) Agree, if the linear regression equation is found to fit the values collected, it is safe to assume that the linear relationship continues beyond the values.
(c) Disagree, there is never a true linear relationship beyond the data points because the relationship always curves or plateaus at some point beyond the data.

### Problem 27

It is believed that 5% of elementary school children have some kind of ADD (Attention Deficit Disorder). Researchers are hoping to track 60 or more of these students for several years. They decide to test 1500 first graders for this problem. What is the probability that they will find enough subjects for their study?

(a) Cannot be calculated with the given data.
(b) Less than 5%.
(c) Between 70% and 80%.
(d) More than 95%.
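
One way to check an answer here is to treat the number of children with ADD among the 1500 tested as a binomial count and add up the exact probabilities. This is a sketch under the assumption that the children are independent and each has the stated 5% chance. The helper `binom_pmf` is written for this example and works on the log scale to avoid overflow.

```python
import math

def binom_pmf(k, n, p):
    """Binomial probability P(X = k), computed on the log scale."""
    log_pmf = (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
    )
    return math.exp(log_pmf)

n, p, needed = 1500, 0.05, 60

# P(X >= 60) = 1 - P(X <= 59)
prob_too_few = sum(binom_pmf(k, n, p) for k in range(needed))
prob_enough = 1 - prob_too_few

print(f"P(at least {needed} children) = {prob_enough:.4f}")
```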

### Problem 28

A random variable that gives the number of people, from a sample of 1000 people, who are in favor of the death penalty is called a

(a) Normal random variable
(b) Binomial random variable
(c) Bernoulli random variable
(d) Poisson random variable

### Problem 29

What do we call a phenomenon if individual outcomes are uncertain but there is a regular distribution of outcomes in a large number of repetitions?

(a) biased
(b) independent
(c) confounded
(d) random

### Problem 30

A radio talk show invites listeners to enter a dispute about a proposed salary increase for city council members. The host says, "What annual salary do you think council members should get? Call us with your number." In all, 958 people call. The mean of all the salaries they suggest is \$9,740 per year, and the standard deviation of the responses is \$1,125. Which of the following statements applies to this situation?

(a) Since the sample was self-selected, the results cannot be trusted
(b) Since the sample was not self-selected, the results can be trusted
(c) Since the sample size is large, the results can be trusted
(d) We are 95% confident that the proposed salary for council members is \$9,669 to \$9,811
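
The interval quoted in option (d) appears to come from the usual large-sample formula, mean ± z × SD / √n. The sketch below reproduces that arithmetic with z = 1.96 as an assumed critical value. Whether a formula that presumes a random sample applies to call-in data is a separate question, and it is the one the problem asks about.

```python
import math

n = 958          # number of callers
mean = 9740.0    # mean suggested salary, in dollars
sd = 1125.0      # standard deviation of the responses
z = 1.96         # approximate 95% normal critical value (assumption)

margin = z * sd / math.sqrt(n)
lower, upper = mean - margin, mean + margin

print(round(lower), round(upper))
```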

### Problem 31

In a large midwestern university with 30 different departments, the university is considering eliminating standardized scores from their admission requirements. The university wants to find out whether the students agree with this plan. They decide to randomly select 100 students from each department, send them a survey, and follow up with a phone call if they do not return the survey within a week. What kind of sampling plan did they use?

(a) Stratified random sampling
(b) Simple random sampling
(c) Cluster sampling
(d) Multi-stage sampling
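
The selection step in this plan, drawing 100 students at random within every department, can be sketched as below. The rosters are hypothetical: department names, student IDs, and the roster size of 400 per department are all invented for the example.

```python
import random

rng = random.Random(31)  # seed fixed only for reproducibility

# Hypothetical rosters: 30 departments with 400 student IDs each.
rosters = {
    f"dept_{d:02d}": [f"dept_{d:02d}_student_{s:03d}" for s in range(400)]
    for d in range(1, 31)
}

per_department = 100

# Draw a separate random sample, without replacement, inside each department.
sample = {
    dept: rng.sample(students, per_department)
    for dept, students in rosters.items()
}

total_surveys = sum(len(chosen) for chosen in sample.values())
print(len(sample), "departments,", total_surveys, "surveys")
```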

### Problem 32

Students in a statistics class designed a survey about spending habits and gave it to a random sample of 300 students, of whom 282 responded. The statistics students decided that since there are over 4000 students at the college, the results of the survey may not be valid for drawing conclusions about how all students at the college spend money. Do you agree or disagree with this conclusion, and why?

(a) Disagree. 282 is a large enough number to use for these purposes if the sample of students is random.
(b) Disagree. If the sample is random, the size of the sample does not matter.
(c) Agree. 282 is too small a percentage of 4000 (7%) to allow us to draw conclusions about the population.
(d) Agree. You should have a sample that is at least 50% of the population in order to make inferences.
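
A rough calculation shows how precision depends on the number of respondents rather than on the share of the population sampled. The sketch assumes the 282 respondents behave like a simple random sample, takes 4000 as the population size, and uses the worst-case proportion p = 0.5. The function `margin_of_error` is a helper written for this example.

```python
import math

def margin_of_error(n, population, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion,
    including the finite population correction."""
    se = math.sqrt(p * (1 - p) / n)
    fpc = math.sqrt((population - n) / (population - 1))
    return z * se * fpc

moe_college = margin_of_error(282, 4000)
moe_huge = margin_of_error(282, 4_000_000)

print(f"population 4,000:     +/- {moe_college:.3f}")
print(f"population 4,000,000: +/- {moe_huge:.3f}")
```

The second call changes only the population size, so comparing the two printed margins shows how little the sampled fraction matters once the sample is random.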

### Problem 33

On October 20, 1993, the San Francisco Chronicle reported on a survey of top high-school students in the U.S. According to the survey:

Cheating is pervasive. Nearly 90 percent admitted some dishonesty, such as copying someone's homework or cheating on an exam. The survey was sent last spring to 5,000 of the nearly 700,000 high achievers included in the 1993 edition of Who's Who Among American High School Students. The results were based on the 1,957 completed surveys that were returned.

Is this survey representative of all teenagers for the reason given below? What is the population represented in this survey?