- 1.
- Let X denote the height of blue spruce trees in a field close to an industrial
area and let Y denote the height of blue spruce trees in a natural area.
Suppose we randomly select 12 trees from each field that are the same age.
Hence, we are using a completely randomized design.
We are interested in the hypotheses:
Ho : The trees in both fields are about the same height.
Ha : The trees in the industrial field are typically shorter than those
in the natural field.
Let T denote the Wilcoxon statistic (number of times Y beats X).
The expected value of T under the null hypothesis is:
- (a).
- 12
- (b).
- 72
- (c).
- 144
- (d).
- 24
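As a rough sketch in R (not part of the original Rweb output): under Ho, and assuming no ties, each of the n_x * n_y (X, Y) pairs is equally likely to go either way, so T is expected to count half of them. The variable names are illustrative.

```r
# Sample sizes from the problem: 12 trees selected from each field.
n_x <- 12
n_y <- 12

# Under Ho, Y beats X in half of the n_x * n_y pairs on average.
expected_T <- n_x * n_y / 2
print(expected_T)
```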
- 2.
- A 95% CI for
the shift in heights between the
trees in the natural field and the industrial field
is given by
(1.0,12.6).
What is the correct decision with regard to the hypotheses?
- (a).
- reject Ho, trees in the natural field are taller than those in the industrial field.
- (b).
- reject Ha, trees in both fields are about the same height.
- (c).
- Inconclusive.
- (d).
- Sample sizes are far too small.
- 3.
- A long term study of accidents at a Quincy shoe factory led management to
conclude that the number of accidents per person during a year (X) is
distributed according to the Poisson law.
The average number of accidents per person per year was 0.3.
The mean and variance of X are
- (a).
- 0.3 , 0.09
- (b).
- 0.3 , 0.548
- (c).
- 0.3 , 0.3
- (d).
- 0.6 , 0.6
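A minimal numerical check in R, assuming X is Poisson with rate 0.3 as stated. It computes the mean and variance directly from the probability mass function; truncating the support at 50 is a convenience of this sketch, since the tail beyond that point is negligible.

```r
lambda <- 0.3            # average accidents per person per year
x <- 0:50                # truncated support; the remaining tail mass is negligible
p <- dpois(x, lambda)

mean_x <- sum(x * p)
var_x  <- sum((x - mean_x)^2 * p)
print(c(mean = mean_x, variance = var_x))
```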
- 4.
- Suppose that a political candidate, Jonesy, claims that he will gain more than 50%
of the votes in a city election and thereby
emerge as the winner. The hypotheses are Ho: Jonesy will get at most 50% of the votes;
and Ha: Jonesy will get more than 50%
of the votes. Which of the following statements is TRUE?
- (a)
- A type 2 error is made if we let Jonesy win.
- (b)
- A type 2 error is committed if we believe that Jonesy will lose and he then emerges as the winner.
- (c)
- A type 1 error is committed if we believe that Jonesy was cheated in the election.
- (d)
- We are 50% confident that Jonesy will win the election.
- 5.
- 100 5th graders were asked the following question: If you
were asked to pick between the following ice cream flavors, which
flavor would you choose?
The results of this survey are given below:
| Vanilla | Chocolate | Strawberry | Total |
|---|---|---|---|
| 35 | 55 | 10 | 100 |
What is a 95% Confidence Interval for the difference in proportions
between those that like chocolate and those that like vanilla ice cream?
- (a.)
- (.35, .55)
- (b.)
- (.018, .382)
- (c.)
- ( 0, .2)
- (d.)
- (.2, 1)
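One way to sketch this interval in R. Both proportions come from the same 100 students, so they are negatively correlated; the sketch assumes the large-sample multinomial variance [p1 + p2 - (p1 - p2)^2] / n for the difference and a normal multiplier. The variable names are illustrative.

```r
n <- 100
p_choc <- 55 / n
p_van  <- 35 / n

diff_hat <- p_choc - p_van
# Standard error of a difference of two proportions from one multinomial sample.
se <- sqrt((p_choc + p_van - diff_hat^2) / n)
ci <- diff_hat + c(-1, 1) * qnorm(0.975) * se
print(round(ci, 3))
```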
- 6.
- An office furniture manufacturer installed a new adhesive application
process, which he believes to be better than the old process.
Random samples were selected from the two processes, and "pull tests" were
performed to determine the number of pounds of pressure that were required to
pull apart the glued parts. Let X and Y denote the pounds of pressure
needed for the old and new processes, respectively.
- (i)
- Y tends to be larger than X.
- (ii)
- X tends to be larger than Y.
- (iii)
- X and Y tend to differ.
- (iv)
- Y tends to be same as X.
What are the hypotheses being tested?
- (a).
- (iv) versus (iii)
- (b).
- (iv) versus (ii)
- (c).
- (iv) versus (i)
- (d).
- (i) versus (ii)
- 7.
- The width of bolts of fabric is normally distributed with mean 950mm (millimeters) and standard deviation 10mm. What is
the probability that a randomly chosen bolt has a width between 947 and 958mm?
- (a)
- 0.2119
- (b)
- 0.4061
- (c)
- 0.3821
- (d)
- 0.8643
Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(958, 950, 10)
[1] 0.7881446
Rweb:>
Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(947, 950, 10)
[1] 0.3820886
Rweb:>
Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(961, 950, 10)
[1] 0.864334
Rweb:>
- 8.
- Scores on a Stat 160 midterm examination are assumed to be normally distributed with mean 78 and variance 36.
Suppose that students scoring in the top 10% of this distribution
are to receive an A grade. What is the minimum score a student must
achieve to earn an A grade?
- (a)
- 70
- (b)
- 86
- (c)
- 81
- (d)
- 90
Rweb:> # NORMAL PERCENTAGE POINT
Rweb:> qnorm(0.10, 78, 6)
[1] 70.31069
Rweb:>
Rweb:> # NORMAL PERCENTAGE POINT
Rweb:> qnorm(0.90, 78, 6)
[1] 85.68931
Rweb:>
Rweb:> # NORMAL PERCENTAGE POINT
Rweb:> qnorm(0.90, 78, 3)
[1] 81.84465
Rweb:>
- 9.
- Suppose that we are in one of those rare times when
65% of the American public approve of the way the President of the
United States is handling his job. We take a random sample of 8
Americans and let X denote the number who give approval. Then the distribution of X is
- (a).
- Binomial n=8, p=.35.
- (b).
- Binomial n=8, p=.65.
- (c).
- Poisson
- (d).
- Normal
and
.
- 10.
- Suppose you are looking at a scatterplot of Y versus X values
and decide to fit a linear model.
You notice, however, some outliers in the Ys.
Which estimate of slope is best and why?
- (a).
- The Wilcoxon because it is not robust to outlying Ys.
- (b).
- The Wilcoxon because it is robust to outlying Ys.
- (c).
- The least squares estimate because it is robust to outlying Ys.
- (d).
- The least squares estimate because it is not robust to outlying Ys.
- 11.
- In baseball, are your throwing hand and the side you bat
independent? A survey was conducted of 500 randomly selected
college baseball players. The results are given below:
| | Bat Left Handed | Bat Right Handed | Total |
|---|---|---|---|
| Throw Left Handed | 50 | 50 | 100 |
| Throw Right Handed | 150 | 250 | 400 |
| Total | 200 | 300 | 500 |
A Chi-Square test of independence was performed at a 95% level
of confidence. The resulting Test Statistic was determined to be
13.3829 with a corresponding p-value of .0003. Based upon this
information, please state your conclusion to the following
hypothesis:
Ho: Throwing hand and batting side are independent
Ha: Throwing hand and batting side are not independent
- (a.)
- Reject Ho. Throwing hand and batting side are
independent.
- (b.)
- Reject Ho. Throwing hand and batting side are not
independent.
- (c.)
- Do Not Reject Ho. Throwing hand and batting side are
not independent.
- (d.)
- Do Not Reject Ho. Throwing hand and batting side are
independent.
For the next two problems:
Let p equal the proportion of Americans who select jogging as one of
their recreational activities. Suppose 1497 out of a random sample of 5757
selected jogging.
- 12.
- Determine the sample proportion who selected jogging
as one of
their recreational activities.
- (a).
- 0.260
- (b).
- 0.50
- (c).
- 1497
- (d).
- 5757
- 13.
- An approximate 95% confidence interval for p is:
- (a).
- (0.452,0.557)
- (b).
- (0.248,0.339)
- (c).
- (0.119,0.271)
- (d).
- (0.248,0.271)
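A short R sketch of the two jogging calculations, assuming the usual large-sample (Wald) interval with a normal multiplier. The printed values are unrounded beyond four digits, so they may differ from the listed choices in the last decimal place.

```r
successes <- 1497
n <- 5757

p_hat <- successes / n
se <- sqrt(p_hat * (1 - p_hat) / n)       # estimated standard error of p_hat
ci <- p_hat + c(-1, 1) * qnorm(0.975) * se
print(round(c(p_hat = p_hat, lower = ci[1], upper = ci[2]), 4))
```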
- 14.
- Which of the following is NOT TRUE about the errors associated with a test of hypothesis?
- (a)
- The probability of a type 1 error is denoted by alpha.
- (b)
- A type I error is made if we reject a true Ho.
- (c)
- We commit a type 2 error when we accept a true alternative hypothesis (Ha).
- (d)
- Usually type 1 error is regarded as the more serious error.
- 15.
- Suppose that there are 14 songs on a compact disk (CD) and you like 8 of them.
When using the random button selector on a CD player, each of the 14 songs is played
once in a random order. Find the probability that among the first 2 songs that are
played, you like both of them. (Hint: use a tree diagram.)
- (a).
- 0.458
- (b).
- 0.326
- (c).
- 0.184
- (d).
- 0.308
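The tree-diagram calculation can be sketched in R as follows. Songs are played without repetition, so the second branch conditions on the first; the counting version is included only as a cross-check. Variable names are illustrative.

```r
liked <- 8
total <- 14

# Tree-diagram product: first song liked, then second liked given the first.
p_tree <- (liked / total) * ((liked - 1) / (total - 1))

# Same probability by counting unordered pairs of songs.
p_count <- choose(liked, 2) / choose(total, 2)
print(c(tree = p_tree, counting = p_count))
```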
The next two problems refer to:
Do you hate Monday? Researchers in Germany have provided another reason for you:
they concluded that the risk of heart attack
on a Monday for a working person may be as much as 50% greater than on any other day
(Riverside Press-Enterprise, Nov 17, 1992).
In an attempt to verify the researchers' claim, 203 working people who had
recently had heart attacks were surveyed. The days
on which their heart attacks occurred appear in the following table:
| Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday |
|---|---|---|---|---|---|---|
| 24 | 36 | 27 | 26 | 33 | 27 | 30 |
Let H0 be that every day of the week is equally likely for a heart attack.
- 16.
- What is the expected frequency of Monday heart attacks under H0?
- (a)
- 20
- (b)
- 24
- (c)
- 29
- (d)
- 40
- 17.
- What is the value of the chi-square test statistic?
- (a)
- 11.70
- (b)
- 3.72
- (c)
- 18.24
- (d)
- 29
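For the two heart-attack questions, a minimal R sketch of the goodness-of-fit calculation under H0 (all seven days equally likely). The hand computation is followed by a cross-check with `chisq.test`, which uses equal probabilities by default when given a single vector of counts.

```r
observed <- c(Sunday = 24, Monday = 36, Tuesday = 27, Wednesday = 26,
              Thursday = 33, Friday = 27, Saturday = 30)

# Under H0 every day has the same expected count.
expected <- rep(sum(observed) / length(observed), length(observed))
chi_sq <- sum((observed - expected)^2 / expected)
print(expected[1])
print(round(chi_sq, 2))

# Cross-check with the built-in test.
print(chisq.test(observed)$statistic)
```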
- 18.
- The fracture strength of tempered glass averages 14 (measured in thousands of pounds per square inch) and has standard
deviation 2. What is the probability that the average fracture strength of 100 randomly selected pieces of this glass exceeds
14.5?
- (a)
- 0.9938
- (b)
- 0.5987
- (c)
- 0.4013
- (d)
- 0.0062
Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(14.5, 14, 2)
[1] 0.5987063
Rweb:>
Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(14.5, 14, .2)
[1] 0.9937903
Rweb:>
Rweb:> # CUMULATIVE NORMAL DISTRIBUTION
Rweb:> pnorm(14, 14.5, 2)
[1] 0.4012937
Rweb:>
The next 2 questions pertain to the following situation:
The following are comparison boxplots between home prices in
Kalamazoo and Mt. Pleasant, based on samples of size 1000 for each
city. Using these comparison boxplots, answer the following questions:
- 19.
- Are there outliers in either data set?
- (a.)
- Kalamazoo has 1 outlier, Mt. Pleasant has 1 outlier.
- (b.)
- No outliers for either city.
- (c.)
- Kalamazoo has no outliers, Mt. Pleasant has 1 outlier.
- (d.)
- Kalamazoo has 2 outliers, Mt. Pleasant has 3
outliers.
- 20.
- Keeping the large sample sizes in mind,
does there appear to be a shift in location between typical home
prices in Kalamazoo and Mt. Pleasant?
- (a.)
- Yes, homes in Kalamazoo appear to be $30,000 more expensive
than homes in Mt. Pleasant.
- (b.)
- Cannot determine any difference between the price of homes in
Kalamazoo and
Mt. Pleasant.
- (c.)
- Cannot compare the cities because the same scale was used
on each boxplot.
- (d.)
- Yes, homes in Mt. Pleasant appear to be $200,000 less
expensive than homes in Kalamazoo.
- 21.
- The cycle time for trucks hauling concrete to a highway construction site is uniformly distributed over the interval
50 to 70 minutes. What is the probability that the cycle time exceeds 65 minutes?
- (a)
- 0.25
- (b)
- 0.77
- (c)
- 0.50
- (d)
- 0.65
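In the style of the Rweb output used elsewhere on this exam, the uniform tail probability can be sketched with `punif`. For a Uniform(50, 70) cycle time this is the length of the interval from 65 to 70 divided by the total length of 20.

```r
# P(cycle time > 65) when the cycle time is Uniform(50, 70).
p_exceeds <- punif(65, min = 50, max = 70, lower.tail = FALSE)
print(p_exceeds)
```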
The next 4 questions pertain to the following situation:
A high school teacher has decided to build a model that will
predict final grades in her English grammar class. Using data
from the previous semester, the teacher has the final grade
scores, final exam scores, and attendance records (number of days
missing class) from 35 students. She decides to build two
Least-Squares Linear Regression Models. The first will predict a
student's final grade score using final exam score. The second
model will predict a student's final grade score using number of days
missing class as a predictor. The scatter plots for both
models are as shown:
The RWEB results of the first model are:
First Model: Predicting Final Grade With Final Exam Score
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.36462 6.74501 1.240 0.224
final exam score 0.88701 0.09027 9.826 2.51e-11
Residual standard error: 5.38 on 33 degrees of freedom
Multiple R-Squared: 0.7453, Adjusted R-squared: 0.7376
F-statistic: 96.56 on 1 and 33 DF, p-value: 2.506e-11
The RWEB results of the second model are:
Second Model: Predicting Final Grade with Attendance Record.
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 78.100 2.350 33.24 <2e-16
Days missed class -2.633 1.079 -2.44 0.0202
Residual standard error: 9.811 on 33 degrees of freedom
Multiple R-Squared: 0.1529, Adjusted R-squared: 0.1272
F-statistic: 5.955 on 1 and 33 DF, p-value: 0.02021
Using this information, please answer the following questions:
- 22.
- A student scores an 85 on the final exam. Using the
appropriate model, predict the final grade score.
- (a.)
- Cannot do. This is extrapolation.
- (b.)
- 85.0
- (c.)
- 8.36
- (d.)
- 83.76
- 23.
- A student has missed a total of 8 days of class. Using
the appropriate model, predict the final grade score.
- (a.)
- 99.17
- (b.)
- 57.036
- (c.)
- 78.10
- (d.)
- Cannot do. This is extrapolation
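A hedged sketch of how the two fitted lines would be evaluated in R, using the coefficients from the regression output above. The function names are hypothetical. The sketch only evaluates the equations; whether a given predictor value lies inside the range of the observed data has to be judged from the scatter plots.

```r
# Coefficients copied from the two fitted models reported above.
predict_from_exam <- function(exam_score) 8.36462 + 0.88701 * exam_score
predict_from_absences <- function(days_missed) 78.100 - 2.633 * days_missed

print(predict_from_exam(85))      # final exam score of 85
print(predict_from_absences(8))   # 8 days of class missed
```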
- 24.
- The teacher would like to know which model does a better
job of predicting final grade scores. Using only the information
given, which of the two models appears to do a better job of
predicting final grade scores?
- (a.)
- The second model is superior, because the intercept
(78.1) is greater than the intercept of the first model (8.36)
- (b.)
- The second model is superior, because a slope of
-2.633 is less than the slope of the first model (.8871)
- (c.)
- The first model is superior, because a higher
percentage of the variation in final grades (R-Squared) is
explained by the first model.
- (d.)
- The first model is superior because it uses
continuous data, which is more informative than the discrete data
used in the second model.
- 25.
- Ignoring the results of the other questions, assume a 95%
Confidence Interval for the slope of Model #2 (attendance model) is
(-4.73, -.518). Based on this information, can we conclude that
attendance is a significant predictor of final grade score?
- (a.)
- Yes, you will always do worse when you miss class.
- (b.)
- No, since the slope is negative.
- (c.)
- No, without a residual plot it is impossible to
answer this question.
- (d.)
- Yes, since 0 is not in the 95% Confidence Interval.
- 26.
- A survey asked respondents the number of
times that they had changed jobs (X) during the past five years.
Given the distribution of X shown in the table, what is the expected value of X?
| No. of job changes (X) | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Probability | 0.3 | 0.4 | 0.2 | 0.05 | 0.03 | 0.02 |
- (a).
- 1.17
- (b).
- 0.166
- (c).
- 2.5
- (d).
- 0.50
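The expected value of a discrete random variable is the probability-weighted sum of its values. A minimal R sketch with the job-change distribution from the problem:

```r
x <- 0:5
p <- c(0.3, 0.4, 0.2, 0.05, 0.03, 0.02)

# Sanity check: the probabilities should sum to 1.
stopifnot(abs(sum(p) - 1) < 1e-12)

expected_x <- sum(x * p)
print(expected_x)
```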
The next 4 questions pertain to the following situation:
Farmer Mike believes his llamas produce a fertilizer
that provides a higher yield of corn than his existing fertilizer.
Farmer Mike divides a 5 acre field into 10 plots of
1/2 acre each. He randomly assigns "Llama Fertilizer" to 5 plots and
"Cow Fertilizer" to the remaining plots. Let Y= the yield (in
bushels of corn) for the Llama fertilizer and let X= the yield (in
bushels of corn) for the Cow fertilizer. Farmer Mike wishes to
test if the Llama fertilizer produces higher yields than the Cow
fertilizer.
The results of this experiment are given below:
Y: 250 270 280 290 350
X: 200 210 230 240 250
Based on this information, answer the following questions:
- 27.
- What type of experimental design has Farmer Mike used?
- (a.)
- Completely Randomized Design.
- (b.)
- Randomized Pair Design.
- (c.)
- Lurking Variable Design.
- (d.)
- Wilcoxon Regression Design.
- 28.
- Assuming a Wilcoxon testing procedure is used, what is the
Test Statistic (T) and Expected Value of the Test Statistic E(T)
assuming the Null Hypothesis (Ho: Y=X) is true?
- (a.)
- T = 0.5 E(T) = 24.5
- (b.)
- T = 24.5 E(T) = 12.5
- (c.)
- T = 12.5 E(T) = 25
- (d.)
- T = 25 E(T) = 12.5
- 29.
- Assuming a Wilcoxon testing procedure is used on the differences Y-X, what is the
estimate of Δ = Y-X, the difference between corn yields using
Llama fertilizer versus corn yields using Cow fertilizer?
- (a.)
- 62 bushels
- (b.)
- 280 bushels
- (c.)
- 50 bushels
- (d.)
- 100 bushels
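The Wilcoxon quantities in the two questions above can be sketched in R from the 25 pairwise differences Y - X. Two assumptions are made here: a tie between a Y and an X counts as one half toward T, and the shift estimate is the median of the pairwise differences. Variable names are illustrative.

```r
y <- c(250, 270, 280, 290, 350)  # Llama fertilizer yields
x <- c(200, 210, 230, 240, 250)  # Cow fertilizer yields

# All pairwise differences Y - X (5 x 5 = 25 of them).
d <- as.vector(outer(y, x, "-"))

# T counts how often Y beats X; a tie contributes one half (assumed convention).
t_stat <- sum(d > 0) + 0.5 * sum(d == 0)
expected_t <- length(x) * length(y) / 2

# Shift estimate: median of the pairwise differences.
shift_hat <- median(d)
print(c(T = t_stat, E_T = expected_t, shift = shift_hat))
```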
Some output from RWEB is as follows:
| Alternative Hypothesis | Ha: Y < X | Ha: Y <> X | Ha: Y > X |
|---|---|---|---|
| p-value | .998 | .002 | .001 |
- 30.
- Choosing the correct Alternative Hypothesis, what is the correct
conclusion for this experiment?
- (a.)
- Do Not Reject Ho, p-value = .998
- (b.)
- Reject Ho, p-value = .002
- (c.)
- Reject Ho, p-value = .001
- (d.)
- Do Not Reject Ho, p-value = .002
For the next two problems:
A box contains 2 gold balls, 3 silver balls and 5 blue balls. A game is played in which a contestant puts both
hands in the box and draws two balls at the same time, 1 ball in each hand. The contestant wins if he gets the two gold balls
and receives the consolation prize if he gets two silver balls.
Consider the following resampling model to determine this probability. Select 10 single digit random numbers from 0,1,...,9
without replacement. Let the numbers 0-1 represent the gold balls, 2-4 represent the silver balls and 5-9 represent the blue
balls. Use the results of 20 trials of this resampling model given below:
- 31.
- What is the estimate of the probability of a contestant winning a consolation prize?
- (a)
- 0.30
- (b)
- 0.25
- (c)
- 0.20
- (d)
- 0.15
- 32.
- What is the error of estimation of a contestant receiving a consolation prize?
- (a)
- 0.0798
- (b)
- 0.0357
- (c)
- 0.1789
- (d)
- 0.1275
For the next two problems:
Let X equal the thickness of peppermint gum that is manufactured for vending machines.
Assume that X is normally distributed with mean μ.
The following are n=10 thicknesses, in hundredths of an inch, on pieces of gum that
were selected randomly from the production line.
7.50 7.95 7.55 7.40 7.45 7.35 7.45 7.45 7.45 7.50
- 33.
- Determine the sample mean:
- (a).
- 7.505
- (b).
- 7.45
- (c).
- 10
- (d).
- 1.96
- 34.
- If the sample standard deviation is given by s=0.166, which of the following
is a 95% confidence interval for μ?
- (a).
- (7.347,7.553)
- (b).
- (7.402,7.608)
- (c).
- (9.897,10.103)
- (d).
- (1.857,2.063)
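A minimal R sketch of the sample mean and an interval for the population mean from the gum data. It assumes the normal multiplier 1.96 with the given s = 0.166; a t-based interval with 9 degrees of freedom would be somewhat wider.

```r
thickness <- c(7.50, 7.95, 7.55, 7.40, 7.45, 7.35, 7.45, 7.45, 7.45, 7.50)
n <- length(thickness)
x_bar <- mean(thickness)
s <- 0.166   # sample standard deviation as given in the problem

# Interval with a normal multiplier (assumption: 1.96 rather than a t quantile).
ci <- x_bar + c(-1, 1) * 1.96 * s / sqrt(n)
print(round(c(mean = x_bar, lower = ci[1], upper = ci[2]), 3))
```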
- 35.
- Let X and Y equal the blood volumes in milliliters for males who are
paraplegics participating in vigorous physical activities and
males who are able bodied participating in normal activities, respectively.
We seek an estimate of the difference in locations, Δ, of Y-X.
Observations of Y are
1612 1352 1456 1222 1560 1456 1924
Observations of X are
1082 1300 1092 1040 910 1248 1092 1040 1092 1288
An estimate for Δ, using sample medians, is
- (a).
- 393.3143
- (b).
- 364
- (c).
- 0
- (d).
- Sample sizes are far too small.
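The median-based estimate of the shift can be sketched in R as the difference between the two sample medians, with the observations entered as listed in the problem:

```r
y <- c(1612, 1352, 1456, 1222, 1560, 1456, 1924)
x <- c(1082, 1300, 1092, 1040, 910, 1248, 1092, 1040, 1092, 1288)

# Difference in locations estimated by the difference of sample medians.
shift_by_medians <- median(y) - median(x)
print(shift_by_medians)
```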