
# Practice Final Examination

Attempt all problems.

1.
Suppose the population of incomes of people who work in industry and have a master's degree in Statistics is positively skewed, with a mean of 55 and a standard deviation of 3 (both in thousands of dollars). Suppose we take a sample of size 100 from this population and form the arithmetic average $\bar{x}$. If we did this repeatedly, what would be the shape of the histogram of the $\bar{x}$'s, and in what interval would the middle 68% of the $\bar{x}$'s lie?

(a)
Positively skewed and (52, 57).
(b)
Positively skewed and (54.7, 55.3).
(c)
Mound shaped and (54.7, 55.3).
(d)
Mound shaped and (52, 57).

2.
In the last problem, use the printout below to find the probability that the average income of 16 such people exceeds 57.

Rweb:> # CUMULATIVE BINOMIAL DISTRIBUTION
Rweb:> pbinom(55, 57, .75)
[1] 0.9999985
Rweb:> # BINOMIAL PROBABILITY
Rweb:> dbinom(55, 57, .75)
[1] 1.340548e-05
Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(57, 55, .75)
[1] 0.9961696

(a)
.9962
(b)
.0038
(c)
.9999
(d)
.0001

3.
To be accepted into Stanford's Graduate Business School, a candidate must pass a four-hour written entrance exam, the SBSE. Ajax Prep Company offers an expensive study course to prepare a person for this exam. Ajax claims that 80% of the students who finish their course pass the SBSE. Pam works for a government agency that thinks Ajax's claim is dubious and that the true percentage of alumni who pass the SBSE is somewhat lower than 80%. So Pam collects a random sample of 64 students who finished Ajax's course and determines that 45 of them passed the SBSE. Based on this information, Pam forms a 95% confidence interval for the true percentage and makes a decision. What was Pam's interval and what was her decision?

(a)
(.65, .76), Ajax's claim is false!
(b)
(.59, .81), Ajax's claim is false!
(c)
(.59, .81), no evidence against Ajax.
(d)
(.65, .76), no evidence against Ajax.

4.
Jane works for Dick's Real Estate Agency, Spot Reality.
Dick wants to determine the median owner's asking price in an exclusive neighborhood. So Jane obtains the following random sample of owners' asking prices (in thousands of dollars):

580 552 928 757 84 394 528 373 859 460 258 998

What is Jane's estimate?

(a)
461.
(b)
564.25.
(c)
540.
(d)
277.50.

5.
In the last problem, Jane was not satisfied with just an estimate, so she used resampling code to obtain the following 100 resampled medians:

373.0 383.5 383.5 383.5 383.5 383.5 394.0 394.0 427.0 427.0
427.0 427.0 427.0 460.0 460.0 460.0 460.0 460.0 460.0 461.0
473.0 494.0 494.0 494.0 506.0 506.0 506.0 506.0 506.0 506.0
506.0 506.0 520.0 520.0 528.0 528.0 528.0 540.0 540.0 540.0
540.0 540.0 540.0 540.0 540.0 552.0 552.0 552.0 552.0 552.0
552.0 552.0 552.0 552.0 552.0 552.0 554.0 554.0 554.0 554.0
554.0 554.0 554.0 554.0 566.0 566.0 566.0 566.0 566.0 580.0
580.0 580.0 580.0 580.0 580.0 580.0 642.5 654.5 654.5 668.5
668.5 668.5 668.5 668.5 668.5 693.5 705.5 719.5 719.5 719.5
754.0 757.0 757.0 808.0 808.0 808.0 808.0 859.0 859.0 859.0

From these she obtained a 95% confidence interval. What was Jane's 95% confidence interval and what does it mean?

(a)
(383.5, 859.0), Jane is fairly confident that this interval contains the true median owner's asking price.
(b)
(383.5, 859.0), Jane is fairly confident that this interval contains the true range of the owner's asking price.
(c)
(496, 632), Jane is fairly confident that this interval contains the true median owner's asking price.
(d)
(496, 632), Jane is fairly confident that this interval contains the true range of the owner's asking price.

6.
Consider the last two problems. Suppose Jane took a larger random sample, say of size 36, and used it to obtain a new estimate of the true median and a new 95% confidence interval. What is true, in general, about the length of the new confidence interval?

(a)
The new interval would have about the same length as the old interval.
(b)
The new interval would have a shorter length than the old interval.
(c)
Can't say because it's another sample.
(d)
Since the new sample size is larger, the new confidence interval would also be larger.

7.
Four pea plants of a certain variety are grown without fertilizer, while five of the same variety are grown with fertilizer. Other than the presence or absence of fertilizer, the plants received the same treatment. Let $\Delta$ be the true mean (or median) increase in plant height due to fertilizer. We want to test the hypotheses H0: $\Delta = 0$ versus HA: $\Delta > 0$. The experiment resulted in the following data:

Height (in.)
Without Fertilizer (Control):  19   8  16  17
With Fertilizer (Treated):     20  13  25  18  15

Determine the value of the Wilcoxon test statistic for these data and determine what we would expect it to be if H0 is true.

(a)
13 (expect it to be 0).
(b)
3.2 (expect it to be 0).
(c)
13 (expect it to be 10).
(d)
3.2 (expect it to be 10).

8.
The data were combined into one big sample, which was resampled 100 times. In each resampling, 4 items were allocated to be new control items and 5 were allocated to be new treated items. For each resampling, the Wilcoxon test statistic was obtained and is given below. Obtain the observed significance level and make the proper decision if your maximum Type I error is at most 5%.

(a)
.28, conclude that typical fertilized peas are taller than unfertilized peas.
(b)
.28, no evidence to conclude that typical fertilized peas are taller than unfertilized peas.
(c)
.56, no evidence to conclude that typical fertilized peas are taller than unfertilized peas.
(d)
.56, conclude that typical fertilized peas are taller than unfertilized peas.

9.
Using the data of Problem 7, obtain the Wilcoxon estimate of $\Delta$.

(a)
1.5.
(b)
3.2.
(c)
13.
(d)
2.5.

10.
Besides an estimate of the effect $\Delta$, suppose we also want a confidence interval for $\Delta$. Which resampling plan below would we use?
(a)
Combine the original samples into one sample and then resample with replacement from the big sample, allocating items to new samples.
(b)
Resample from each sample without replacement.
(c)
Resample from each sample with replacement.
(d)
Combine the original samples into one sample and then resample without replacement from the big sample, allocating items to new samples.

11.
25 cars were put on test. The first 10 used a standard fuel while the others used a fuel designed (hopefully) to increase miles per gallon. The same amount of fuel was used in each car. Below are the comparison dotplots of the cars' miles per gallon.

[Comparison dotplots of miles per gallon for the Standard and Additive groups; horizontal axis from 35.0 to 70.0.]

What else, if anything, needs to be done to "correctly" infer about the new additive?

(a)
Obtain a point estimate and confidence interval for the effect (difference in means or medians).
(b)
It is clear from the dotplots that the new fuel additive is effective, so no further statistics are needed.
(c)
It is clear from the dotplots that the new fuel additive gives similar miles per gallon as the standard, so no further statistics are needed.
(d)
Obtain a point estimate for the effect (difference in means or medians).

12.
A new type of surgery for a certain heart disease has been developed. In order to test it, 50 patients who have the disease were selected. Half of them got the new surgery while the others received the standard surgery. After the surgery, each patient's surgery was rated a success, a failure, or no change by a team of doctors who did not know which surgery the patient had received. Suppose we decide to rate the surgeries on their success rates. Let pN be the true proportion of successful surgeries for the new operation and let pS be the true proportion of successful surgeries for the standard operation. Based on the data below, estimate pN - pS.
                    Success   Failure   No Change
New Surgery            16         7         2
Standard Surgery       10        10         5

(a)
.64
(b)
.24
(c)
.40
(d)
.195

13.
For the last problem, suppose we want to test H0: pN = pS versus HA: pN > pS. Using 2000 resamples, we obtain the 95% confidence interval (-.03, .50) for pN - pS. Which of the following statements is the proper conclusion for testing H0 versus HA?

(a)
The sample sizes are far too small to conclude anything.
(b)
There is sufficient evidence at the .05 level to conclude that the new surgery is better than the standard.
(c)
There is insufficient evidence at the .05 level to conclude that the new surgery is better than the standard.
(d)
It is clear from the data that the new surgery is better than the standard, so confidence intervals are not needed.

14.
The following data are the monthly rental prices for a random sample of 10 unfurnished studio apartments in the center of a large city.

955, 1000, 985, 980, 940, 975, 965, 999, 1247, 1119

List the 5-number summary (min, Q1, median, Q3, max).

(a)
940, 965, 982.5, 1000, 1247
(b)
955, 985, 960, 999, 1119
(c)
940, 955, 982.5, 1119, 1247
(d)
955, 985, 957.5, 999, 1119

15.
In order to estimate how much water will be needed to supply the community of Falling Rock in the next decade, the town council asked the city manager to find out how much water a typical family uses. A random sample of 15 Falling Rock families used the following amounts of water (in thousands of gallons) in the past year.

4.1, 13.1, 14.0, 14.6, 15.5, 16.4, 16.9, 18.2, 18.3, 18.8, 19.7, 21.5, 22.7, 23.8, 32.2

Identify the outliers (if any) in this data set.

(a)
4.1, 32.2
(b)
4.1, 13.1, 23.8, 32.2
(c)
There aren't any outliers
(d)
18.2

The next two questions refer to the following situation:

The following side-by-side boxplots represent the prices of gasoline (per gallon) based on a random sample of gasoline stations in Detroit and Chicago.

16.
What can be said about the scale (variation) of the Detroit and Chicago gasoline prices?
(a)
Chicago has less variation
(b)
Chicago has more variation
(c)
Detroit has more variation
(d)
Approximately equal

17.
What can be said about the shift between Detroit and Chicago gasoline prices?

(a)
Detroit has higher prices
(b)
The prices are approximately equal
(c)
Chicago has higher prices
(d)
Since the variances are different, it is impossible to tell.

The next two questions refer to the following situation:

An agent for a residential real estate company would like to predict selling prices for homes based upon the size (amount of square footage). A sample of 25 homes in a particular neighborhood was selected. A regression analysis revealed the following scatterplot and regression equation.

The regression equation is
Price = -7598 + 105 Square Feet

18.
If a home has 1800 square feet of living area, what selling price would this regression model predict?

(a)
$196,698
(b)
Unable to predict; we would be extrapolating
(c)
$181,402
(d)
$189,000
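A fitted line can be evaluated by plugging the predictor value into the equation. The sketch below does that in Python for the reported equation; the function and variable names are illustrative, and it assumes that 1800 square feet lies within the range of the sampled homes, which only the scatterplot can confirm.

```python
# Fitted line as reported in the text: Price = -7598 + 105 * Square Feet.
INTERCEPT = -7598
SLOPE = 105  # predicted change in price (dollars) per additional square foot


def predict_price(square_feet):
    """Return the selling price predicted by the fitted line."""
    return INTERCEPT + SLOPE * square_feet


price_1800 = predict_price(1800)
print(price_1800)  # 181402

# The slope is the change in the prediction for one extra square foot.
print(predict_price(1801) - predict_price(1800))  # 105
```

The second print illustrates how the slope is read: each additional square foot changes the predicted price by the slope coefficient.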

19.
According to this model, for every additional square foot of living area, by how much will the price of a home change?
(a)
Decrease $7598
(b)
Increase $7598
(c)
Decrease $105
(d)
Increase $105

20.
Your job is to assemble computers. A local company sends you fuses used in the construction of the computers. Your company estimates that 20% of these fuses are defective. You have just received a shipment of 100 fuses from the local company (80 good fuses, 20 defective fuses). You pick 3 fuses at random from this shipment to assemble 3 computers. What is the probability that you will have 0 defective fuses in your 3 computers? (Hint: Use a tree diagram to calculate.)

(a)
.0071
(b)
.5081
(c)
.4919
(d)
.8
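The tree diagram in the hint multiplies conditional probabilities along the "good, good, good" branch, because the fuses are drawn without replacement. A minimal Python sketch of that product follows; the function name is illustrative, and exact fractions are used so no rounding enters until the end.

```python
from fractions import Fraction


def prob_no_defectives(good, defective, draws):
    """Probability that every one of `draws` items taken without replacement is good."""
    prob = Fraction(1)
    for i in range(draws):
        # After i good draws, (good - i) good items remain among (total - i).
        prob *= Fraction(good - i, good + defective - i)
    return prob


p_none = prob_no_defectives(good=80, defective=20, draws=3)
print(p_none, float(p_none))
```

Here the product is (80/100)(79/99)(78/98), which is slightly smaller than the value 0.8 cubed that sampling with replacement would give.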

21.
A survey of 100 people was taken. The question was: "Please check the appropriate response regarding if you have used the following products over the past month:" The answers from these 100 people are as follows:

Event                                    Response

Taken Tylenol                            60
Taken Pepto-Bismol                       25
Taken Both Tylenol and Pepto-Bismol      15
Taken Neither Tylenol nor Pepto-Bismol   30

Total                                    100

Are the events "Taken Tylenol" and "Taken Pepto-Bismol" independent events?

(a)
Yes
(b)
No
(c)
(d)
Only if this is a random sample of 100 people
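Two events are independent exactly when the probability of both equals the product of the individual probabilities. The sketch below checks that condition from the survey counts; the variable names are illustrative, and the proportions are treated as the probabilities of interest.

```python
from fractions import Fraction

n_people = 100
p_tylenol = Fraction(60, n_people)  # took Tylenol
p_pepto = Fraction(25, n_people)    # took Pepto-Bismol
p_both = Fraction(15, n_people)     # took both

# Independence holds when P(A and B) == P(A) * P(B).
product = p_tylenol * p_pepto
independent = (p_both == product)
print(p_both, product, independent)
```

Exact fractions avoid any floating-point doubt about whether the two sides are equal.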

The next two questions refer to the following situation:

I wish to estimate the probability of getting three or more of a kind (at least three ones, three twos, three threes, three fours, three fives, or three sixes) on the first roll of 5 fair dice, as in the game Yahtzee. I perform 15 resampling trials with the following results:

Trial 1
5	6	2	5	2

Trial 2
4	3	6	6	5

Trial 3
5	6	5	5	1

Trial 4
2	1	4	6	1

Trial 5
5	5	3	2	4

Trial 6
3	4	6	1	4

Trial 7
6	4	1	6	4

Trial 8
1	5	5	6	6

Trial 9
5	3	1	6	1

Trial 10
6	6	6	3	6

Trial 11
4	3	3	2	3

Trial 12
3	5	2	6	4

Trial 13
6	6	3	6	1

Trial 14
3	2	6	5	6

Trial 15
6	1	2	5	6


22.
What is the estimate of the probability of getting three or more of a kind?

(a)
.733
(b)
0
(c)
.267
(d)
.6

23.
What is the error of estimation for this probability?
(a)
0
(b)
.2668
(c)
.2529
(d)
.2285
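The 15 trials can be tallied mechanically: count the trials in which some face appears at least three times, then divide by the number of trials. The Python sketch below does this and also computes an error of estimation, taken here to be two standard errors of a sample proportion, 2*sqrt(p(1-p)/n); that formula is an assumption about the course's convention, and the names are illustrative.

```python
from collections import Counter
from math import sqrt

# The 15 trials of 5 dice, copied from the listing above.
trials = [
    [5, 6, 2, 5, 2], [4, 3, 6, 6, 5], [5, 6, 5, 5, 1], [2, 1, 4, 6, 1],
    [5, 5, 3, 2, 4], [3, 4, 6, 1, 4], [6, 4, 1, 6, 4], [1, 5, 5, 6, 6],
    [5, 3, 1, 6, 1], [6, 6, 6, 3, 6], [4, 3, 3, 2, 3], [3, 5, 2, 6, 4],
    [6, 6, 3, 6, 1], [3, 2, 6, 5, 6], [6, 1, 2, 5, 6],
]


def has_three_of_a_kind(roll):
    """True if any face shows up three or more times in the roll."""
    return max(Counter(roll).values()) >= 3


hits = sum(has_three_of_a_kind(roll) for roll in trials)
n_trials = len(trials)
p_hat = hits / n_trials

# Assumed convention: error of estimation = 2 * standard error of p_hat.
error = 2 * sqrt(p_hat * (1 - p_hat) / n_trials)
print(hits, round(p_hat, 3), round(error, 4))
```

With only 15 trials the error is large relative to the estimate, so many more trials would be needed for a precise answer.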

24.
Consider a metabolic defect that occurs in one of every 100 births. If four infants are born in a particular hospital on a given day, what is the probability that at least one has the defect? Use the following output from the probability module.

Rweb:> # CUMULATIVE BINOMIAL DISTRIBUTION
Rweb:> pbinom(1, 4, .01)
[1] 0.999408
Rweb:> # BINOMIAL PROBABILITY
Rweb:> dbinom(0, 4, .01)
[1] 0.960596

(a)
0.96
(b)
0.04
(c)
0.999
(d)
0.001
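"At least one" is the complement of "none", so the needed probability is one minus the binomial probability of zero defects. The Python sketch below reproduces the `dbinom(0, 4, .01)` line of the output with a small helper (the helper is written here for illustration and is not part of Rweb) and then takes the complement.

```python
from math import comb


def dbinom(k, n, p):
    """Binomial probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_none = dbinom(0, 4, 0.01)      # same quantity as dbinom(0, 4, .01) in the output
p_at_least_one = 1 - p_none      # complement rule
print(round(p_none, 6), round(p_at_least_one, 4))
```

The cumulative value `pbinom(1, 4, .01)` in the printout is the probability of at most one defect, which is a different event.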

25.
Suppose that buses arrive at a bus stop every 15 minutes and that the waiting time for the next bus to arrive has a uniform distribution on the interval from 0 to 15 minutes. Find the probability that a person's waiting time will exceed 10 minutes.

(a)
5/15
(b)
4/15
(c)
1/10
(d)
1/2

26.
If X has a normal distribution with mean 30 and standard deviation 5, which of the following has the greatest probability?

(a)
X < 35
(b)
X > 30
(c)
X > 20
(d)
X < 37.5

27.
The scores of a national achievement test were approximately normally distributed with a mean of 540 and a standard deviation of 110. If you achieve a score of 680, what percentage of those who took the examination scored lower than you? Use the following probability module output.

Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(680, 540, 110)
[1] 0.8984426
Rweb:> # NORMAL PERCENTAGE POINT
Rweb:> qnorm(.80, 540, 110)
[1] 632.5783


(a)
0.800
(b)
0.102
(c)
0.898
(d)
0.975

28.
Refer to the situation and output given in the previous question. C College admits students whose score on the test is among the top 20%. What is the lowest score that would guarantee admission?

(a)
680.0
(b)
632.6
(c)
540.0
(d)
650.0
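The two lines of output answer two different kinds of question: `pnorm` turns a score into the proportion below it, while `qnorm` turns a proportion into a score. The sketch below mirrors both calls with Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

scores = NormalDist(mu=540, sigma=110)

# Proportion scoring below 680; corresponds to pnorm(680, 540, 110).
below_680 = scores.cdf(680)

# Score with 80% of test takers below it, so the top 20% start here;
# corresponds to qnorm(.80, 540, 110).
cutoff_top20 = scores.inv_cdf(0.80)

print(round(below_680, 4), round(cutoff_top20, 1))
```

Reading "top 20%" as "80% below" is the step that links the admission cutoff to the 0.80 percentage point.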

29.
A study was performed to investigate the relationship between the carburetor jetting size and the time of a Camaro for a quarter-mile run. The data are:

Jet Size    76      68      70      72      74      76
Time        15.08   14.60   14.50   14.53   14.79   15.02


The Wilcoxon analysis output (from the regression module) is given below :

                Coef    Std. Err   t-ratio
intercept  9.4675300   2.3165700   4.08687
Jet        0.0724995   0.0318255   2.27803


Use the results above to predict the time for a jet size of 76. What is the predicted time?

(a)
15.05
(b)
15.02
(c)
15.08
(d)
14.98
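A prediction from the fitted line is the intercept plus the slope times the jet size. The Python sketch below plugs the reported Wilcoxon coefficients into that formula; the names are illustrative.

```python
# Coefficients from the Wilcoxon regression output above.
INTERCEPT = 9.4675300
SLOPE_JET = 0.0724995


def predict_time(jet_size):
    """Predicted quarter-mile time for a given jet size."""
    return INTERCEPT + SLOPE_JET * jet_size


predicted_76 = predict_time(76)
print(round(predicted_76, 2))
```

The prediction comes from the fitted line and need not equal either of the two times observed at a jet size of 76.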

30.
Consider the situation in the previous question. Use the Wilcoxon fit to obtain a 95% confidence interval for the slope parameter and use it to test:
• H0 : Slope is 0
• H1 : Slope is not 0.
The interval and conclusion are:

(a)
(0.01 , 0.14) ; Slope is not 0.
(b)
(0.04 , 0.10) ; Slope is not 0.
(c)
(-1.89 , 2.03) ; Slope is 0.
(d)
(0.04 , 0.10) ; Slope is 0.
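An approximate 95% confidence interval for the slope can be sketched as the estimate plus or minus a multiple of its standard error, with H0 rejected when the interval excludes 0. The Python sketch below uses a multiplier of 2, which is an assumption about the course's rule of thumb; a t critical value for such a small sample would be larger and would give a wider interval.

```python
# Slope estimate and standard error from the Wilcoxon regression output above.
SLOPE = 0.0724995
STD_ERR = 0.0318255
MULTIPLIER = 2  # assumed rule-of-thumb multiplier for about 95% confidence

lower = SLOPE - MULTIPLIER * STD_ERR
upper = SLOPE + MULTIPLIER * STD_ERR

# H0: slope = 0 is rejected when 0 lies outside the interval.
zero_inside = lower <= 0 <= upper
print(round(lower, 2), round(upper, 2), zero_inside)
```

The decision depends on the multiplier, so the class convention should be confirmed before relying on the conclusion.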
31.
To decide whether a newly developed gasoline additive increases gas mileage you will compare the gas mileage for cars with and without the additive. A recent study randomly selected a single group of 5 cars and had each of the 5 cars driven both with and without the additive.
With(Y)         :       25.7    20.0    28.4    13.5    18.4
Without(X)      :       24.9    18.8    27.7    13.0    18.8
Diff(Y-X)       :

What is the value of the signed rank Wilcoxon test statistic?

(a)
14
(b)
9
(c)
0.65
(d)
15

32.
Consider the data and context of the above question. About what value is the signed rank Wilcoxon test statistic centered, i.e. what is its expected value under H0?

(a)
0.65
(b)
7.5
(c)
14
(d)
10
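The signed rank statistic can be computed by ranking the absolute differences and adding up the ranks that belong to positive differences. The Python sketch below does this for the five cars and also evaluates n(n+1)/4, the expected value of that sum under H0. It assumes the statistic is defined as the sum of positive ranks and that there are no ties or zero differences, which holds for these data; the names are illustrative.

```python
with_additive = [25.7, 20.0, 28.4, 13.5, 18.4]     # Y
without_additive = [24.9, 18.8, 27.7, 13.0, 18.8]  # X

# Differences Y - X, rounded to one decimal to remove floating-point noise.
diffs = [round(y - x, 1) for y, x in zip(with_additive, without_additive)]

# Rank the differences by absolute value (1 = smallest); no ties occur here.
ordered = sorted(diffs, key=abs)
t_plus = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)

# Under H0 each rank is equally likely to carry either sign,
# so the expected sum of positive ranks is n(n+1)/4.
n = len(diffs)
expected_under_h0 = n * (n + 1) / 4
print(diffs, t_plus, expected_under_h0)
```

Data with ties or zero differences would need midranks or dropped zeros, which this sketch does not handle.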

33.
Consider the data and context of the previous 2 questions. Suppose our interest is in determining whether there is a difference in gas mileage with and without the additive. So, the differences were computed and entered into the class code for the paired Wilcoxon analysis. The class code reported a 95% confidence interval for the difference. The interval is (0.05, 1.0). What conclusion do you draw based on the interval?

(a)
There is a difference.
(b)
There is no difference.
(c)
Inconclusive.
(d)
Not enough information.

34.
Nine students were randomly assigned to be taught by one of two different teaching techniques. They were tested at the end of a specified period of time. The following are the data.

Technique 1	:	65	87	73	79
Technique 2	:	75	69	83	81	72


What type of design is this?

(a)
Randomized paired design
(b)
Completely randomized design
(c)
Controlled regression design
(d)
Latin square design

35.
Regression was performed using a response variable (Y) and a predictor (X). The regression equation obtained is . Does the data show regression towards the mean?

(a)
Yes
(b)
No
(c)
Insufficient information.


2001-01-01