Stat 216 Final Exam Review Problems
Note: The final exam will also include some exam 2 topics not covered here.
For the next 20 problems, consider the Emeadow data listed on page F4 of Appendix F. Appraised Value (in thousands of dollars), number of Bedrooms, number of Bathrooms, and Age of the house are given, among other variables, for 74 homes in East Meadow, New York.
For now, let’s try to predict appraised Value with the number of Bathrooms.
1.) Which of these two variables is the response, or Y, variable? What does that make the other variable?
2.) What does the following scatterplot tell us about the relationship between Value and Bathrooms?
Note: Excel/PhStat automatically and incorrectly put Value on the horizontal axis, and Bathrooms on the vertical axis. When doing regressions, you should look out for this – take a close look at the units on each axis and decide if the plot has the variables on the appropriate axes. If not, it can be fixed by doing the following: Click on the sheet for the scatterplot. Pull down the Chart menu and choose Source Data. Click on Series. Now change X values from a2:a75 to d2:d75 (which is the cell range for Bathrooms), and change Y values from d2:d75 to a2:a75 (which is the cell range for Value). Now the plot will be correct.
3.) Here is the regression output from Excel/PhStat. What is the regression equation? Use it to predict the average Value of houses with two Bathrooms.
4.) Interpret the slope.
5.) How much variation in Value is explained by Bathrooms?
6.) What is the standard error of the estimated regression line?
7.) Is there evidence at α = .05 that Bathrooms is a significant linear predictor of Value?
8.) Give a 95% confidence interval for the true slope.
9.) Suppose we are interested in houses with one Bathroom. See the following estimation output. What is a 99% interval estimate for the average Value of such homes?
10.) Now suppose we are interested in a house with three Bathrooms. See the following estimation output. What is a 99% interval estimate for the Value of a house with that many bathrooms?
11.) Here is the residual plot for this regression. What does this say about the fit of our model?
Now, let’s try to improve our predictive power by including a few more explanatory variables. Let’s predict Value using number of Bedrooms, number of Bathrooms, and Age of the house. The Excel/PhStat output is below.
Note: Once again, there is a bit of a "trick" to get what you want from the software. When you specify the X variables cell range, you’ll notice that all of the explanatory variables need to be in consecutive columns. Unused variables (in this case, Rooms) need to be moved or temporarily deleted. So just select the Rooms column (E), and choose Delete from the Edit menu. (You won’t be allowed to save it this way, so don’t worry about contaminating the data set for the next person.) Then you can specify c1:e75 for the X variables. This will pick Bedrooms, Bathrooms, and Age as the explanatory variables.
12.) What is the regression equation?
13.) Interpret the slope estimate for the Age variable.
14.) Predict the average Value of 20-year-old houses with 3 Bedrooms and 1½ Bathrooms. Is this extrapolation?
15.) How much variation is accounted for by the three X variables?
16.) What is the standard error of the estimate?
17.) Determine if there is a significant relationship between Value and the three explanatory variables at α = .05.
18.) At α = .05, determine whether each of the three X variables makes a significant contribution to the regression model or not. Be sure to write out your conclusions for all three X’s and what results (i.e., test statistics, p-values) you used to make each conclusion.
19.) Give a 90% confidence interval for the true slope associated with the Bathroom variable.
20.) Consider the residual plots for this regression problem. How does the model appear to fit the data?
Note: This last plot, residuals vs. fitted values, was not automatically output by Excel/PhStat with the Multiple regression procedure. This plot had to be generated with the "Chart Wizard", and the details won’t be provided here.
21.) Before commercials are placed on national television, they undergo testing and modification. Marketing researchers often show one version of a commercial to half the broadcasting audience and a second version to the other half. Then a follow-up telephone survey is conducted to measure the impact of the ad. For the following example, are the two versions of the commercial equally remembered? Use α = .05.
Commercial   Don't remember   Remember seeing   Remember key point
Version A    19               24                37
Version B    24               28                18
22.) Imported goods can be challenged for infringement of a U.S. patent, copyright, or trademark under Section 337 of a tariff act. Once a Section 337 challenge is brought to the International Trade Commission, it results in one of three decisions. The results for 190 challenges, involving three countries, are given below, along with the Excel/PhStat χ^{2} test output. Is there evidence that trade violation results depend on the country in question? Use α = .10.
23.) The Equal Credit Opportunity Act forbids lenders in the US from asking the marital status of women who are applying for personal loans. Many women feel that this act should be extended to include business loans, citing instances where women received business loans only after the lender determined that they were married to men who had good credit ratings (Business Week, 27 May 1985). Suppose that a women’s group has collected data on the business loan applications of 600 women, and that the results are as summarized below. Is there evidence of bias on the part of lenders regarding marital status? Use α = .05.
                 Loan
Marital Status   Granted   Denied
Single           253       119
Married          181       47
24.) A sample of 400 union labor contracts was selected and classified according to two characteristics: duration of contract and type of industry. Based on the Excel/PhStat output, is the duration of union contracts independent of type of industry? Use α = .01.
1.)
Value is the response, or Y variable.
So Bathrooms is the explanatory, or X variable.
2.) We have a positive (or increasing) relationship here. As the number of Bathrooms increases, Value increases.
3.)
The regression equation is
Value = 135.97 + 43.93Bathrooms.
Yhat = 135.97 + 43.93(2) = 223.83, or $223,830.
4.) As the number of Bathrooms increases by one, Value increases by 43.93 units ($43,930), on average.
5.) R^{2} = 48.9%
6.) s = 27.7 ($27,700)
7.)
H_{0}: β_{1} = 0
H_{1}: β_{1} ≠ 0
t = 8.30
p-value = approximately zero < .05
Reject H_{0}. Yes, Bathrooms is a significant predictor of Value.
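The t statistic in answer 7 can be checked by hand as the slope estimate divided by its standard error. The short sketch below does that arithmetic, assuming the standard error 5.2924 and the critical value 1.9935 quoted in answer 8; the variable names are illustrative only.

```python
# Values copied from the answer key; se_b1 and t_crit are quoted in answer 8.
b1 = 43.93       # estimated slope for Bathrooms
se_b1 = 5.2924   # standard error of the slope
t_crit = 1.9935  # t_{.025} with n - 2 = 72 degrees of freedom

# Test statistic for H0: beta_1 = 0
t_stat = b1 / se_b1

# Two-sided test at alpha = .05: reject when |t| exceeds the critical value
reject_h0 = abs(t_stat) > t_crit

print(round(t_stat, 2), reject_h0)
```

Comparing |t| with the critical value leads to the same decision as comparing the p-value with α.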
8.)
b_{1} ± t_{.025}(n-2) s_{b1}
43.93 ± 1.9935(5.2924)
43.93 ± 10.55
(33.38, 54.48)
This is also provided in the output.
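As a quick check of the arithmetic, the interval can be computed as estimate ± critical value × standard error. This is a minimal sketch; the helper name slope_ci is hypothetical, and the critical value is taken from the answer rather than computed.

```python
def slope_ci(estimate, std_error, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin

# 95% interval for the Bathrooms slope in the simple regression:
# t_{.025} with 72 degrees of freedom is 1.9935 (from the answer above).
low, high = slope_ci(43.93, 5.2924, 1.9935)
print(round(low, 2), round(high, 2))
```

The same helper applies to any other confidence level or model by swapping in the appropriate critical value and degrees of freedom.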
9.)
This is a confidence interval. From the output, we have
(165.783, 194.025), or ($165,783, $194,025).
10.)
This is a prediction interval. From the output, we have
(192.086, 343.446), or ($192,086, $343,446).
11.) The plot does not show any curves or patterns, so the model fits the data adequately.
12.) Value = 154.067 + 6.591Bedrooms + 37.484Bathrooms - 0.914Age
13.) As Age increases by one year, Value decreases by 0.914 units ($914), as long as we hold Bedrooms and Bathrooms constant.
14.)
Yhat = 154.067 + 6.591(3) + 37.484(1.5) - 0.914(20) = 211.786 ($211,786)
To determine if this is extrapolation or not, we need to look at the range of
values for the three X variables. One way is to get the descriptive statistics
for Age, Bedrooms and Bathrooms, in order to see what the
minimum and maximum values are for each.
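The prediction and the extrapolation check can be sketched in a few lines. The coefficients come from answer 12. The function names are hypothetical, and the ranges passed to is_extrapolation must come from the descriptive statistics of the data; the ranges shown here are made-up placeholders, not values from the Emeadow data set.

```python
INTERCEPT = 154.067
COEFFS = {"Bedrooms": 6.591, "Bathrooms": 37.484, "Age": -0.914}

def predict_value(house):
    """Predicted Value (in thousands of dollars) from the fitted equation."""
    return INTERCEPT + sum(COEFFS[name] * house[name] for name in COEFFS)

def is_extrapolation(house, ranges):
    """True if any X value falls outside its observed (min, max) range."""
    return any(not (lo <= house[name] <= hi)
               for name, (lo, hi) in ranges.items())

house = {"Bedrooms": 3, "Bathrooms": 1.5, "Age": 20}
print(round(predict_value(house), 3))

# PLACEHOLDER ranges for illustration only; replace them with the actual
# minimum and maximum of each variable before drawing any conclusion.
placeholder_ranges = {"Bedrooms": (2, 5), "Bathrooms": (1, 3), "Age": (1, 50)}
print(is_extrapolation(house, placeholder_ranges))
```

Note that checking each variable separately is only a first pass: a combination of values can be unusual even when each value is within its own range.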
15.) R^{2} = 51.86%
16.) s = 27.275 ($27,275)
17.)
F = 25.14
p-value = approximately zero < .05
Reject H_{0}: β_{1} = β_{2} = β_{3} = 0.
There is a significant relationship between Value and the three X variables.
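The overall F statistic is tied to R^{2} through F = (R^{2}/k) / ((1 - R^{2})/(n - k - 1)), which gives a handy consistency check between answers 15 and 17. A minimal sketch, with the function name overall_f chosen for illustration:

```python
def overall_f(r_squared, n, k):
    """Overall F statistic from R^2, sample size n, and k predictors."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# R^2 = 51.86% from answer 15, n = 74 homes, k = 3 explanatory variables
f_stat = overall_f(0.5186, n=74, k=3)
print(round(f_stat, 2))
```

The result agrees with the F in the output up to rounding of R^{2}.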
18.)
Bedrooms:
H_{0}: β_{1} = 0
H_{1}: β_{1} ≠ 0
t = 1.52
p-value = .1337 > .05
Do not reject H_{0}.
No, Bedrooms is not a significant predictor of Value.
Bathrooms:
H_{0}: β_{2} = 0
H_{1}: β_{2} ≠ 0
t = 6.10
p-value = approx. zero < .05
Reject H_{0}.
Yes, Bathrooms is a significant predictor of Value.
Age:
H_{0}: β_{3} = 0
H_{1}: β_{3} ≠ 0
t = -1.54
p-value = .1276 > .05
Do not reject H_{0}.
No, Age is not a significant predictor of Value.
19.)
The output is for 95%!! We'll have to do it by hand:
b_{2} ± t_{.05}(n-4) s_{b2}
37.48 ± 1.6669(6.1418)
37.48 ± 10.238
(27.242, 47.718)
20.) None of the four residual plots have any pattern, so the fit of the model seems to be adequate.
21.)
Here is the table with totals and expected frequencies added:
Commercial   Don't remember   Remember seeing   Remember key point   Total
Version A    19 (22.93)       24 (27.73)        37 (29.33)           80
Version B    24 (20.07)       28 (24.27)        18 (25.67)           70
Total        43               52                55                   150
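The expected frequencies in this table are (row total × column total) / grand total, and the χ^{2} statistic sums (observed - expected)^{2} / expected over all six cells. The sketch below shows one way the test could be carried through from the table; the function name is hypothetical, and the p-value line relies on the fact that the upper-tail χ^{2} probability equals exp(-x/2) when there are exactly 2 degrees of freedom.

```python
import math

def chi_square_independence(observed):
    """Return (statistic, df, expected) for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

observed = [[19, 24, 37],   # Version A
            [24, 28, 18]]   # Version B
stat, df, expected = chi_square_independence(observed)

# Closed-form upper-tail probability, valid only because df == 2 here
p_value = math.exp(-stat / 2)

print(round(stat, 2), df, round(p_value, 3))
print("reject H0 at alpha = .05:", p_value < 0.05)
```

The statistic can equivalently be compared with the χ^{2} critical value for 2 degrees of freedom at α = .05, which is about 5.991.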
22.)
H_{0}: Decision & Country are independent
H_{1}: Decision depends on Country
χ^{2} = 18.9223
p-value = .00081 < .10
Reject H_{0}. Trade violation results seem to depend on which country it is.
23.)
Here is the table with totals and expected frequencies added:
                 Loan
Marital Status   Granted        Denied         Total
Single           253 (269.08)   119 (102.92)   372
Married          181 (164.92)   47 (63.08)     228
Total            434            166            600
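For a 2×2 table, the χ^{2} statistic can also be computed with the shortcut n(ad - bc)^{2} / (product of the four marginal totals), which gives the same value as summing (observed - expected)^{2} / expected. The sketch below applies the shortcut to the loan table as one way the test could be finished; the function name is hypothetical, and the p-value uses the identity that a χ^{2} variable with 1 degree of freedom is a squared standard normal.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Rows: Single, Married; columns: Granted, Denied
stat = chi_square_2x2(253, 119, 181, 47)

# Upper-tail probability for 1 degree of freedom: P(Z^2 > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(stat / 2))

print(round(stat, 2), round(p_value, 4))
print("reject H0 at alpha = .05:", p_value < 0.05)
```

The χ^{2} critical value for 1 degree of freedom at α = .05 is about 3.841, so the statistic can be compared against that instead of using the p-value.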
24.)
H_{0}: p_{1} = p_{2} = p_{3}
H_{1}: At least one pair p_{i} ≠ p_{j}
χ^{2} = 4.1566
p-value = .125144 > .01
Do not reject H_{0}.
The duration of contracts seems to be independent of industry type.