Final Stat 160 Spring Term 2003 April 2003 NAME: FORM A

The next two problems refer to:

Let X denote the height of blue spruce trees in a field close to an industrial area and let Y denote the height of blue spruce trees in a natural area. Suppose we randomly select 12 trees of the same age from each field. Hence, we are using a completely randomized design. We are interested in the hypotheses:

Ho : The trees in both fields are about the same height.
Ha : The trees in the industrial field are typically shorter than those
in the natural field.

1.
Let T denote the Wilcoxon statistic (number of times Y beats X). The expected value of T under the null hypothesis is:

(a).
12
(b).
72
(c).
144
(d).
24
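
As a rough sketch of how the null expectation in problem 1 could be checked in R: it assumes the usual result that, under Ho, each of the X-versus-Y pairings is equally likely to go either way. The variable names are illustrative only.

```r
# Sample sizes from the problem statement: 12 trees from each field
n_x <- 12
n_y <- 12

# Under Ho, Y is expected to beat X in half of the n_x * n_y pairings
expected_T <- n_x * n_y / 2
expected_T
```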

2.
A 95% CI for the shift in heights between the trees in the natural field and the industrial field is given by (1.0,12.6). What is the correct decision with regard to the hypotheses?

(a).
reject Ho, trees in the natural field are taller than those in the industrial field.
(b).
reject Ha, trees in both fields are about the same height.
(c).
Inconclusive.
(d).
Sample sizes are far too small.

3.
A long term study of accidents at a Quincy shoe factory led management to conclude that the number of accidents per person during a year (X) is distributed according to the Poisson law. The average number of accidents per person per year was 0.3. The mean and variance of X are

(a).
0.3 , 0.09
(b).
0.3 , 0.548
(c).
0.3 , 0.3
(d).
0.6 , 0.6

4.
Suppose that a political candidate, Jonesy, claims that he will gain more than 50% of the votes in a city election and thereby emerge as the winner. The hypotheses are Ho: Jonesy will get at most 50% of the votes; and Ha: Jonesy will get more than 50% of the votes. Which of the following statements is TRUE?
(a)
A type 2 error is made if we let Jonesy win.
(b)
A type 2 error is committed if we believe that Jonesy will lose and then he emerges as the winner.
(c)
A type 1 error is committed if we believe that Jonesy was cheated in the election.
(d)
We are 50% confident that Jonesy will win the election.

5.
100 5th graders were asked the following question: If you were asked to pick between the following ice cream flavors, which flavor would you choose?

The results of this survey are given below:

Vanilla   Chocolate   Strawberry   Total
35        55          10           100

What is a 95% Confidence Interval for the difference in proportions between those that like chocolate and those that like vanilla ice cream?

(a.)
(.35, .55)
(b.)
(.018, .382)
(c.)
( 0, .2)
(d.)
(.2, 1)
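
One possible way to set up the interval in problem 5 in R is sketched below. It assumes the large-sample formula for the difference of two proportions taken from the same sample (so the two proportions are negatively correlated); the variable names are illustrative.

```r
n <- 100
p_choc <- 55 / n   # proportion choosing chocolate
p_van  <- 35 / n   # proportion choosing vanilla
d <- p_choc - p_van

# Same-sample (multinomial) proportions:
# Var(p1 - p2) = (p1 + p2 - (p1 - p2)^2) / n
se <- sqrt((p_choc + p_van - d^2) / n)

# Approximate 95% interval for the difference
ci <- d + c(-1, 1) * qnorm(0.975) * se
round(ci, 3)
```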

6.
An office furniture manufacturer installed a new adhesive application process, which he believes to be better than the old process. Random samples were selected from the two processes, and "pull tests" were performed to determine the number of pounds of pressure that were required to pull apart the glued parts. Let X and Y denote the pounds of pressure needed for the old and new processes, respectively.

(i)
Y tends to be larger than X.
(ii)
X tends to be larger than Y.
(iii)
X and Y tend to differ.
(iv)
Y tends to be same as X.

What are the hypotheses being tested?

(a).
(iv) versus (iii)
(b).
(iv) versus (ii)
(c).
(iv) versus (i)
(d).
(i) versus (ii)

7.
The width of bolts of fabric is normally distributed with mean 950mm (millimeters) and standard deviation 10mm. What is the probability that a randomly chosen bolt has a width between 947 and 958mm?
(a)
0.2119
(b)
0.4061
(c)
0.3821
(d)
0.8643

Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(958, 950, 10)
[1] 0.7881446
Rweb:>

Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(947, 950, 10)
[1] 0.3820886
Rweb:>

Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(961, 950, 10)
[1] 0.864334
Rweb:>


8.
Scores on a Stat 160 midterm examination are assumed to be normally distributed with mean 78 and variance 36. Suppose that students scoring in the top 10% of this distribution are to receive an A grade. What is the minimum score a student must achieve to earn an A grade?
(a)
70
(b)
86
(c)
81
(d)
90

Rweb:> # NORMAL PERCENTAGE POINT
Rweb:> qnorm(0.10, 78, 6)
[1] 70.31069
Rweb:>

Rweb:> # NORMAL PERCENTAGE POINT
Rweb:> qnorm(0.90, 78, 6)
[1] 85.68931
Rweb:>

Rweb:> # NORMAL PERCENTAGE POINT
Rweb:> qnorm(0.90, 78, 3)
[1] 81.84465
Rweb:>


9.
Suppose that we are in one of those rare times when 65% of the American public approve of the way the President of the United States is handling his job. We take a random sample of 8 Americans and let X denote the number who give approval. Then the distribution of X is

(a).
Binomial n=8, p=.35.
(b).
Binomial n=8, p=.65.
(c).
Poisson
(d).
Normal and .

10.
Suppose you are looking at a scatterplot of Y versus X values and decide to fit a linear model. You notice, however, some outliers in the Ys. Which estimate of slope is best and why?

(a).
The Wilcoxon because it is not robust to outlying Ys.
(b).
The Wilcoxon because it is robust to outlying Ys.
(c).
The least squares estimate because it is robust to outlying Ys.
(d).
The least squares estimate because it is not robust to outlying Ys.

11.
In baseball, are your throwing hand and the side you bat independent? A survey was conducted of 500 randomly selected college baseball players. The results are given below:

                     Bat Left Handed   Bat Right Handed   Total
Throw Left Handed    50                50                 100
Throw Right Handed   150               250                400
Total                200               300                500

A Chi-Square test of independence was performed at a 95% level of confidence. The resulting Test Statistic was determined to be 13.3829 with a corresponding p-value of .0003. Based upon this information, please state your conclusion to the following hypotheses:

Ho: Throwing hand and batting side are independent
Ha: Throwing hand and batting side are not independent

(a.)
Reject Ho. Throwing hand and batting side are independent.
(b.)
Reject Ho. Throwing hand and batting side are not independent.
(c.)
Do Not Reject Ho. Throwing hand and batting side are not independent.
(d.)
Do Not Reject Ho. Throwing hand and batting side are independent.

For the next two problems: Let p equal the proportion of Americans who select jogging as one of their recreational activities. Suppose 1497 out of a random sample of 5757 selected jogging.

12.
Determine the sample proportion who selected jogging as one of their recreational activities.

(a).
0.260
(b).
0.50
(c).
1497
(d).
5757

13.
An approximate 95% confidence interval for p is:

(a).
(0.452,0.557)
(b).
(0.248,0.339)
(c).
(0.119,0.271)
(d).
(0.248,0.271)
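
The calculations in problems 12 and 13 could be sketched in R as follows, assuming the usual large-sample (normal approximation) interval for a proportion. The variable names are illustrative, and the endpoints may differ from the listed choices in the third decimal place depending on rounding.

```r
x <- 1497   # number who selected jogging
n <- 5757   # sample size

p_hat <- x / n
se <- sqrt(p_hat * (1 - p_hat) / n)

# Approximate 95% interval: p_hat +/- z * se
ci <- p_hat + c(-1, 1) * qnorm(0.975) * se
round(c(p_hat, ci), 3)
```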

14.
Which of the following is NOT TRUE about the errors associated with a test of hypotheses?

(a)
The probability of a type 1 error is denoted by alpha.
(b)
A type I error is made if we reject a true Ho.
(c)
We commit a type 2 error when we accept a true alternative hypothesis (Ha).
(d)
Usually type 1 error is regarded as the more serious error.

15.
Suppose that there are 14 songs on a compact disk (CD) and you like 8 of them. When using the random button selector on a CD player, each of the 14 songs is played once in a random order. Find the probability that among the first 2 songs that are played, you like both of them. (Hint: use a tree diagram.)

(a).
0.458
(b).
0.326
(c).
0.184
(d).
0.308
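
The tree-diagram calculation suggested in the hint for problem 15 can be sketched in R by multiplying along the "like, like" branch. The hypergeometric cross-check is an addition for illustration, not part of the original hint.

```r
liked <- 8    # songs you like
total <- 14   # songs on the CD

# First song liked, then second song liked given the first was liked
p_both <- (liked / total) * ((liked - 1) / (total - 1))

# Cross-check: 2 liked songs among 2 drawn without replacement
p_check <- dhyper(2, liked, total - liked, 2)

round(p_both, 3)
```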

The next two problems refer to:

Do you hate Monday? Researchers in Germany have provided another reason for you: they concluded that the risk of heart attack on a Monday for a working person may be as much as 50% greater than on any other day (Riverside Press-Enterprise, Nov 17, 1992). In an attempt to verify the researchers' claim, 203 working people who had recently had heart attacks were surveyed. The days on which their heart attacks occurred appear in the following table:

Sunday  Monday  Tuesday  Wednesday  Thursday  Friday  Saturday
24      36      27        26         33       27       30


Let H0 be that every day of the week is equally likely for a heart attack.

16.
What is the expected frequency of Monday heart attacks under H0?
(a)
20
(b)
24
(c)
29
(d)
40

17.
What is the value of the chi-square test statistic?
(a)
11.70
(b)
3.72
(c)
18.24
(d)
29
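
For problems 16 and 17, a minimal R sketch of the expected counts and the chi-square statistic under H0 (all seven days equally likely) might look like this. The variable names are illustrative.

```r
observed <- c(Sunday = 24, Monday = 36, Tuesday = 27, Wednesday = 26,
              Thursday = 33, Friday = 27, Saturday = 30)

# Under H0 each day gets an equal share of the 203 heart attacks
expected <- rep(sum(observed) / length(observed), length(observed))

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq <- sum((observed - expected)^2 / expected)

# chisq.test() on a single vector tests equal probabilities by default
chi_sq_check <- unname(chisq.test(observed)$statistic)

round(c(expected[1], chi_sq), 2)
```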

18.
The fracture strength of tempered glass averages 14 (measured in thousands of pounds per square inch) and has standard deviation 2. What is the probability that the average fracture strength of 100 randomly selected pieces of this glass exceeds 14.5?
(a)
0.9938
(b)
0.5987
(c)
0.4013
(d)
0.0062

Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(14.5, 14, 2)
[1] 0.5987063
Rweb:>

Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(14.5, 14, .2)
[1] 0.9937903
Rweb:>

Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(14, 14.5, 2)
[1] 0.4012937
Rweb:>


The next 2 questions pertain to the following situation:

The following are comparison boxplots between home prices in Kalamazoo and Mt. Pleasant, based on samples of size 1000 for each city. Using these comparison boxplots, answer the following questions:

19.
Are there outliers in either data set?

(a.)
Kalamazoo has 1 outlier, Mt. Pleasant has 1 outlier.
(b.)
No outliers for either city.
(c.)
Kalamazoo has no outliers, Mt. Pleasant has 1 outlier.
(d.)
Kalamazoo has 2 outliers, Mt. Pleasant has 3 outliers.

20.
Keeping the large sample sizes in mind, does there appear to be a shift in location between typical home prices in Kalamazoo and Mt. Pleasant?
(a.)
Yes, homes in Kalamazoo appear to be $30,000 more expensive than homes in Mt. Pleasant.
(b.)
Cannot determine any difference between the price of homes in Kalamazoo and Mt. Pleasant.
(c.)
Cannot compare the cities because the same scale was used on each boxplot.
(d.)
Yes, homes in Mt. Pleasant appear to be $200,000 less expensive than homes in Kalamazoo.

21.
The cycle time for trucks hauling concrete to a highway construction site is uniformly distributed over the interval 50 to 70 minutes. What is the probability that the cycle time exceeds 65 minutes?
(a)
0.25
(b)
0.77
(c)
0.50
(d)
0.65
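
In the same spirit as the Rweb output given for the normal problems, the uniform probability in problem 21 could be checked with `punif`. This is a sketch only; `p_exceed` is an illustrative name.

```r
# Cycle time is uniform on [50, 70]; upper tail beyond 65 minutes
p_exceed <- punif(65, min = 50, max = 70, lower.tail = FALSE)
p_exceed
```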

The next 4 questions pertain to the following situation:

A high school teacher has decided to build a model that will predict final grades in her English grammar class. Using data from the previous semester, the teacher has the final grade scores, final exam scores, and attendance records (number of days missing class) from 35 students. She decides to build two Least-Squares Linear Regression Models. The first will predict a student's final grade score using final exam score. The second model will predict a student's final grade score using number of days missing class as a predictor. The scatter plots for both models are as shown:

The RWEB results of the first model are:

First Model:  Predicting Final Grade With Final Exam Score

Coefficients:
                  Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)        8.36462     6.74501    1.240     0.224
final exam score   0.88701     0.09027    9.826  2.51e-11

Residual standard error: 5.38 on 33 degrees of freedom
Multiple R-Squared: 0.7453,  Adjusted R-squared: 0.7376
F-statistic: 96.56 on 1 and 33 DF,  p-value: 2.506e-11

The RWEB results of the second model are:

Second Model:  Predicting Final Grade with Attendance Record.

Coefficients:
                   Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)          78.100       2.350    33.24    <2e-16
Days missed class    -2.633       1.079    -2.44    0.0202

Residual standard error: 9.811 on 33 degrees of freedom
Multiple R-Squared: 0.1529,  Adjusted R-squared: 0.1272
F-statistic: 5.955 on 1 and 33 DF,  p-value: 0.02021


22.
A student scores an 85 on the final exam. Using the appropriate model, predict the final grade score.
(a.)
Cannot do. This is extrapolation.
(b.)
85.0
(c.)
8.36
(d.)
83.76

23.
A student has missed a total of 8 days of class. Using the appropriate model, predict the final grade score.

(a.)
99.17
(b.)
57.036
(c.)
78.10
(d.)
Cannot do. This is extrapolation
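
For problems 22 and 23, the fitted values from the two regression outputs could be computed as sketched below. `predict_line` is a hypothetical helper written for illustration. Whether a given x value lies inside the observed data range (the extrapolation question) cannot be checked from the coefficients alone.

```r
# Hypothetical helper: fitted value from a simple linear regression
predict_line <- function(intercept, slope, x) intercept + slope * x

# Model 1: final grade from a final exam score of 85
grade_from_exam <- predict_line(8.36462, 0.88701, 85)

# Model 2: final grade from 8 days of missed class
grade_from_absences <- predict_line(78.100, -2.633, 8)

round(c(grade_from_exam, grade_from_absences), 3)
```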

24.
The teacher would like to know which model does a better job of predicting final grade scores. Using only the information given, which of the two models appears to do a better job of predicting final grade scores?

(a.)
The second model is superior, because the intercept (78.1) is greater than the intercept of the first model (8.36)
(b.)
The second model is superior, because a slope of -2.633 is less than the slope of the first model (.88701)
(c.)
The first model is superior, because a higher percentage of the variation in final grades (R-Squared) is explained by the first model.
(d.)
The first model is superior because it uses continuous data, which is more informative than the discrete data used in the second model.

25.
Ignoring the results of the other questions, assume a 95% Confidence Interval for the slope of Model #2 (attendance model) is (-4.73, -.518). Based on this information, can we conclude that attendance is a significant predictor of final grade score?

(a.)
Yes, you will always do worse when you miss class.
(b.)
No, since the slope is negative.
(c.)
No, without a residual plot it is impossible to answer this question.
(d.)
Yes, since 0 is not in the 95% Confidence Interval.

26.
A survey asked respondents the number of times (X) that they had changed jobs during the past five years. Given the distribution of X shown in the table, what is the expected value of X?

No. of job changes(X) :  0      1       2       3        4        5
Probability (X)       :  0.3    0.4     0.2     0.05     0.03     0.02


(a).
1.17
(b).
0.166
(c).
2.5
(d).
0.50
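
The expected value in problem 26 is the probability-weighted sum of the values of X. A short R sketch, with illustrative variable names:

```r
x <- 0:5                                   # number of job changes
p <- c(0.3, 0.4, 0.2, 0.05, 0.03, 0.02)    # probabilities from the table

# Sanity check: the probabilities should sum to 1
stopifnot(isTRUE(all.equal(sum(p), 1)))

# E(X) = sum of x * P(X = x)
expected_x <- sum(x * p)
expected_x
```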

The next 4 questions pertain to the following situation:

Farmer Mike believes his llamas produce a fertilizer that provides a higher yield of corn than his existing fertilizer. Farmer Mike divides a 5 acre field into 10 plots of 1/2 acre each. He randomly assigns "Llama Fertilizer" to 5 plots and "Cow Fertilizer" to the remaining plots. Let Y= the yield (in bushels of corn) for the Llama fertilizer and let X= the yield (in bushels of corn) for the Cow fertilizer. Farmer Mike wishes to test if the Llama fertilizer produces higher yields than the Cow fertilizer.

The results of this experiment are given below:

Y: 250 270 280 290 350
X: 200 210 230 240 250

Based on this information, answer the following questions:

27.
What type of experimental design has Farmer Mike used?
(a.)
Completely Randomized Design.
(b.)
Randomized Pair Design.
(c.)
Lurking Variable Design.
(d.)
Wilcoxon Regression Design.

28.
Assuming a Wilcoxon testing procedure is used, what is the Test Statistic (T) and Expected Value of the Test Statistic E(T) assuming the Null Hypothesis (Ho: Y=X) is true?

(a.)
T = 0.5 E(T) = 24.5
(b.)
T = 24.5 E(T) = 12.5
(c.)
T = 12.5 E(T) = 25
(d.)
T = 25 E(T) = 12.5
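
A sketch of how the Wilcoxon count in problem 28 could be tallied in R. It assumes T is the number of (X, Y) pairs in which Y beats X, with a tie counted as one half; that tie convention is an assumption, not something stated in the problem.

```r
y <- c(250, 270, 280, 290, 350)   # Llama fertilizer yields
x <- c(200, 210, 230, 240, 250)   # Cow fertilizer yields

# All 25 pairwise differences Y - X
diffs <- outer(y, x, "-")

# Count wins for Y, with ties counted as 1/2 (assumed convention)
T_stat <- sum(diffs > 0) + 0.5 * sum(diffs == 0)

# Under Ho, half of the pairings are expected to favor Y
expected_T <- length(x) * length(y) / 2

c(T = T_stat, ET = expected_T)
```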

29.
Assuming a Wilcoxon testing procedure is used on the differences Y-X, what is the estimate of Δ=Y-X, the difference between corn yields using Llama fertilizer versus corn yields using Cow fertilizer?

(a.)
62 bushels
(b.)
280 bushels
(c.)
50 bushels
(d.)
100 bushels
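
For problem 29, a minimal R sketch follows. It assumes the Wilcoxon estimate of the shift is the median of all pairwise differences Y-X (often called the Hodges-Lehmann estimate); the variable names are illustrative.

```r
y <- c(250, 270, 280, 290, 350)   # Llama fertilizer yields
x <- c(200, 210, 230, 240, 250)   # Cow fertilizer yields

# Median of the 25 pairwise differences Y - X
shift_estimate <- median(outer(y, x, "-"))
shift_estimate
```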

Some output from RWEB is as follows:

Alternative Hypothesis:   Ha: Y < X     Ha: Y <> X    Ha: Y > X
p-value:                  .998          .002          .001


30.
Choosing the correct Alternative Hypothesis, what is the correct conclusion for this experiment?

(a.)
Do Not Reject Ho, p-value = .998
(b.)
Reject Ho, p-value = .002
(c.)
Reject Ho, p-value = .001
(d.)
Do Not Reject Ho, p-value = .002

For the next two problems:

A box contains 2 gold balls, 3 silver balls and 5 blue balls. A game is played in such a way that a person will put both his hands in the box and draw two balls at the same time, 1 ball in each hand. The contestant will win if he gets the two gold balls and he will have the consolation prize if he gets two silver balls.

Consider the following resampling model to determine this probability. Select 10 single digit random numbers from 0,1,...,9 without replacement. Let the numbers 0-1 represent the gold balls, 2-4 represent the silver balls and 5-9 represent the blue balls. Use the results of 20 trials of this resampling model given below:

31.
What is the estimate of the probability of a contestant winning a consolation prize?
(a)
0.30
(b)
0.25
(c)
0.20
(d)
0.15

32.
What is the error of estimation for the probability of a contestant receiving a consolation prize?
(a)
0.0798
(b)
0.0357
(c)
0.1789
(d)
0.1275

For the next two problems:

Let X equal the thickness of peppermint gum that is manufactured for vending machines. Assume that the distribution of X is normally distributed with mean μ. The following are n=10 thicknesses, in hundredths of an inch, on pieces of gum that were selected randomly from the population line.

7.50  7.95  7.55  7.40  7.45  7.35  7.45  7.45  7.45  7.50


33.
Determine the sample mean:
(a).
7.505
(b).
7.45
(c).
10
(d).
1.96

34.
If the sample standard deviation is given by s=0.166, which of the following is a 95% confidence interval for μ?

(a).
(7.347,7.553)
(b).
(7.402,7.608)
(c).
(9.897,10.103)
(d).
(1.857,2.063)
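
The calculations in problems 33 and 34 could be sketched in R as below. The first interval assumes the normal critical value (about 1.96). With n=10 a t critical value is the more conservative choice, so a t-based interval is shown alongside for comparison. The variable names are illustrative.

```r
thickness <- c(7.50, 7.95, 7.55, 7.40, 7.45, 7.35, 7.45, 7.45, 7.45, 7.50)
n <- length(thickness)

x_bar <- mean(thickness)
s <- 0.166   # sample standard deviation as stated in problem 34

# Interval using the normal critical value (assumed here)
z_ci <- x_bar + c(-1, 1) * qnorm(0.975) * s / sqrt(n)

# Interval using the t critical value with n - 1 degrees of freedom
t_ci <- x_bar + c(-1, 1) * qt(0.975, df = n - 1) * s / sqrt(n)

round(rbind(z = z_ci, t = t_ci), 3)
```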

35.
Let X and Y equal the blood volumes in milliliters for males who are paraplegics participating in vigorous physical activities and males who are able bodied participating in normal activities, respectively. We seek an estimate of the difference in locations, Δ, of Y-X.

Observations of Y are
1612  1352  1456  1222  1560  1456  1924

Observations of X are
1082  1300  1092  1040  910  1248  1092  1040  1092  1288


An estimate for Δ, using sample medians, is

(a).
393.3143
(b).
364
(c).
0
(d).
Sample sizes are far too small.
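
For problem 35, the estimate based on sample medians can be sketched in R as the difference of the two medians. The difference in sample means is computed as well, for comparison only. The variable names are illustrative.

```r
y <- c(1612, 1352, 1456, 1222, 1560, 1456, 1924)
x <- c(1082, 1300, 1092, 1040, 910, 1248, 1092, 1040, 1092, 1288)

# Shift estimate from sample medians
median_shift <- median(y) - median(x)

# Shift estimate from sample means, for comparison
mean_shift <- mean(y) - mean(x)

c(median = median_shift, mean = round(mean_shift, 4))
```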