
# Exercises

1.
Consider the following data:

| X  | Y  |
|----|----|
| -2 | 0  |
| 2  | 3  |
| 5  | 10 |
| -1 | 1  |
| 6  | 15 |

(a)
Compute the correlation between Y and X.
(b)
Compute the mean and SD of X.
(c)
Compute the mean and SD of Y.
(d)
Compute the slope of the regression line for predicting Y from X.
(e)
Compute the intercept of the regression line for predicting Y from X.
(f)
Write down the regression equation for predicting Y from X.
(g)
Add 5 to Y, so the new values are 5, 8, 15, 6, 20. What do you think will happen to the slope and intercept? (Hint: What happens to the scatterplot if you add 5 to Y?) Verify your answer by calculation.
(h)
Multiply Y by 5, so the new values are 0, 15, 50, 5, 75. What do you think will happen to the slope and intercept? (Hint: What happens to the scatterplot if you multiply Y by 5?) Verify your answer by calculation.
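
The hand calculations in Exercise 1 can be checked numerically. The following is a minimal Python sketch, not part of the original exercise set. The helper names (`mean`, `sd`, `correlation`, `regression_line`) are illustrative, and `sd` divides by n by default, which is an assumption; pass `ddof=1` if your course uses the n - 1 convention. It relies on the standard relations slope = r × SD(Y) / SD(X) and intercept = mean(Y) - slope × mean(X).

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sd(values, ddof=0):
    """Standard deviation; ddof=0 divides by n, ddof=1 divides by n - 1."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - ddof))


def correlation(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def regression_line(x, y):
    """Return (slope, intercept) of the line for predicting y from x."""
    slope = correlation(x, y) * sd(y) / sd(x)
    intercept = mean(y) - slope * mean(x)
    return slope, intercept


# Data from Exercise 1.
x = [-2, 2, 5, -1, 6]
y = [0, 3, 10, 1, 15]

slope, intercept = regression_line(x, y)
print(f"r = {correlation(x, y):.4f}")
print(f"mean(X) = {mean(x):.2f}, SD(X) = {sd(x):.4f}")
print(f"mean(Y) = {mean(y):.2f}, SD(Y) = {sd(y):.4f}")
print(f"Y = {intercept:.2f} + {slope:.2f} X")

# Parts (g) and (h): shift Y by 5, then scale Y by 5, and refit.
slope_shift, intercept_shift = regression_line(x, [v + 5 for v in y])
slope_scale, intercept_scale = regression_line(x, [v * 5 for v in y])
print(f"Y + 5: slope = {slope_shift:.2f}, intercept = {intercept_shift:.2f}")
print(f"5 * Y: slope = {slope_scale:.2f}, intercept = {intercept_scale:.2f}")
```

Comparing the last two printed lines with the first fit is one way to confirm the reasoning about the scatterplot in parts (g) and (h).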

2.
Consider the data in Exercise 1.
(a)
Write down the regression equation for predicting Y from X.
(b)
If a sixth point were found, and this point had value X=4, what is the regression prediction of the Y-value?
(c)
Compute the Predicted Value for each of the five data points. Do the Predicted Values have the same average as Y?
(d)
Which of the five Y-values are higher than their Predicted Value?
(e)
Compute the Residual for each of the five data points. Do the Residuals have 0 average?
(f)
Compute SSE and SSTo. Is SSE smaller than SSTo?
(g)
Compute SSR and R². How is R² related to the correlation between X and Y?
(h)
What percentage of SSTo is 'explained' by X? What percentage of SSTo is not explained by X?
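
The decomposition asked for in Exercise 2 can be sketched in a few lines of Python. This is an illustrative check rather than a prescribed method; the function names `fit_line` and `sums_of_squares` are hypothetical. It fits the least-squares line, forms the predicted values and residuals, and computes SSE, SSTo, and SSR = SSTo - SSE.

```python
def fit_line(x, y):
    """Least-squares (slope, intercept) for predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def sums_of_squares(x, y):
    """Return predicted values, residuals, SSE, SSTo and SSR."""
    slope, intercept = fit_line(x, y)
    my = sum(y) / len(y)
    predicted = [intercept + slope * a for a in x]
    residuals = [b - p for b, p in zip(y, predicted)]
    sse = sum(e ** 2 for e in residuals)      # unexplained variation
    ssto = sum((b - my) ** 2 for b in y)      # total variation
    ssr = ssto - sse                          # explained variation
    return predicted, residuals, sse, ssto, ssr


# Data from Exercise 1.
x = [-2, 2, 5, -1, 6]
y = [0, 3, 10, 1, 15]

slope, intercept = fit_line(x, y)
predicted, residuals, sse, ssto, ssr = sums_of_squares(x, y)
r_squared = ssr / ssto

print("prediction at X = 4:", round(intercept + slope * 4, 2))
print("predicted:", [round(p, 2) for p in predicted])
print("residuals:", [round(e, 2) for e in residuals])
print(f"SSE = {sse:.2f}, SSTo = {ssto:.2f}, SSR = {ssr:.2f}")
print(f"R^2 = {r_squared:.4f}")
```

Averaging `predicted` and `residuals` answers the questions in parts (c) and (e), and `r_squared` can be compared with the square of the correlation for part (g).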

3.
Consider the data in Exercise 1.
(a)
Fill in the following ANOVA table (the P-value for F is optional):
```
ANOVA
            df   SS        MS          F           Significance F
Regression  ___  _______   _______     _______     _______
Residual    ___  _______   _______
Total       ___  _______
```

(b)
If a sixth point were found, and this point had value X=4, the Y-value is predicted to be _________ give or take _________.
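
The entries of the ANOVA table follow mechanically from the sums of squares, as the sketch below illustrates. It is a hedged example, not the course's official solution: the function name `anova_table` is hypothetical, and the "give or take" value is taken here to be the residual standard error sqrt(MSE), which is an assumption. Some texts instead use the r.m.s. error sqrt(SSE / n) or a full prediction interval, which is wider.

```python
from math import sqrt


def anova_table(x, y):
    """Return the simple-regression ANOVA quantities as a dict."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssto = sum((b - my) ** 2 for b in y)   # total sum of squares
    ssr = sxy ** 2 / sxx                   # regression sum of squares
    sse = ssto - ssr                       # residual sum of squares
    df_reg, df_res = 1, n - 2              # one predictor; n - 2 for residuals
    msr, mse = ssr / df_reg, sse / df_res
    return {
        "df": (df_reg, df_res, n - 1),
        "SS": (ssr, sse, ssto),
        "MS": (msr, mse),
        "F": msr / mse,
        "residual_se": sqrt(mse),          # assumed meaning of "give or take"
    }


# Data from Exercise 1.
x = [-2, 2, 5, -1, 6]
y = [0, 3, 10, 1, 15]

table = anova_table(x, y)
for name, df, ss in zip(("Regression", "Residual", "Total"),
                        table["df"], table["SS"]):
    print(f"{name:<11}{df:>3}{ss:>10.3f}")
print(f"MSR = {table['MS'][0]:.3f}, MSE = {table['MS'][1]:.3f}")
print(f"F = {table['F']:.3f}")
print(f"residual standard error = {table['residual_se']:.3f}")
```

The P-value for F is left out because it needs the F distribution, which the Python standard library does not provide; a statistics package or a printed F table would be used for that step.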

4.
Consider a random sample of records of home resales from Feb 15 to Apr 30, 1993, taken from the files maintained by the Albuquerque Board of Realtors. This type of data is collected by multiple listing agencies in many cities and is used by realtors as an information base.

```

PRICE = Selling price (in hundreds of dollars)
SQFT = Square feet of living space
AGE = Age of home (years)
FEATS = Number out of 11 features (dishwasher, refrigerator, microwave,
        disposer, washer, intercom, skylight(s), compactor, dryer,
        handicap fit, cable TV access)
NE = Located in northeast sector of city (1) or not (0)
CUST = Custom built (1) or not (0)
COR = Corner location (1) or not (0)
TAX = Annual taxes (dollars)
* = Missing value

PRICE   SQFT    AGE  FEATS  NE CUST  COR   TAX
2050     2650    13    7    1    1    0    1639
2080     2600     *    4    1    1    0    1088
2150     2664     6    5    1    1    0    1193
2150     2921     3    6    1    1    0    1635
1999     2580     4    4    1    1    0    1732
1900     2580     4    4    1    0    0    1534
1800     2774     2    4    1    0    0    1765
1560     1920     1    5    1    1    0    1161
1450     2150     *    4    1    0    0    *
1449      710     1    3    1    1    0    1010
1375     1837     4    5    1    0    0    1191
1270     1880     8    6    1    0    0    930
1250     2150     5    3    1    0    0    984
1235     1894     4    5    1    1    0    1112
1170     1928     8    8    1    1    0    600
1180     1830     *    3    1    0    0    733
1155     1767    16    4    1    0    0    794
1110     1630    15    3    1    0    1    867
1139     1680    17    4    1    0    1    750
995      1725     *    3    1    0    0    923
995      1500    15    4    1    0    0    743
975      1430     *    3    1    0    0    752
975      1360     *    4    1    0    0    696
900      1400    16    2    1    0    1    731
960      1573    17    6    1    0    0    768
860      1385     *    2    1    0    0    653
```

(a)
Set up a scatter diagram of PRICE (X) vs. TAX (Y).
(b)
Assuming a linear relationship, use the least-squares method to find the regression coefficients.
(c)
Interpret the meaning of the slope estimate.
(d)
Compute R-square.
(e)
At the 0.05 level of significance, is there a significant linear relationship between PRICE and TAX?
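
Because the table marks missing values with `*`, any calculation for Exercise 4 first has to drop incomplete records. The sketch below shows one possible way to do that and then compute the fit and the t statistic for the slope. It is an illustration under stated assumptions: the names `RAW`, `parse_pairs`, and `slope_test` are hypothetical, the PRICE and TAX columns were copied from the table above, and PRICE is treated as X and TAX as Y, as the exercise specifies.

```python
from math import sqrt

# PRICE and TAX columns copied from the table; "*" marks a missing value.
RAW = """\
2050 1639
2080 1088
2150 1193
2150 1635
1999 1732
1900 1534
1800 1765
1560 1161
1450 *
1449 1010
1375 1191
1270 930
1250 984
1235 1112
1170 600
1180 733
1155 794
1110 867
1139 750
995 923
995 743
975 752
975 696
900 731
960 768
860 653
"""


def parse_pairs(raw):
    """Return (price, tax) pairs, skipping rows with a missing value."""
    pairs = []
    for line in raw.splitlines():
        fields = line.split()
        if len(fields) != 2 or "*" in fields:
            continue
        pairs.append((float(fields[0]), float(fields[1])))
    return pairs


def slope_test(pairs):
    """Least-squares fit plus the t statistic for H0: slope = 0."""
    n = len(pairs)
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    mse = (syy - slope * sxy) / (n - 2)     # SSE / (n - 2)
    return {
        "n": n,
        "slope": slope,
        "intercept": my - slope * mx,
        "r2": sxy ** 2 / (sxx * syy),
        "t": slope / sqrt(mse / sxx),       # slope / SE(slope)
    }


result = slope_test(parse_pairs(RAW))
for key, value in result.items():
    print(f"{key} = {value:.4f}")
```

For part (e), the printed t statistic would be compared with the two-sided critical value of the t distribution with n - 2 degrees of freedom at the 0.05 level, taken from a table or a statistics package.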


2003-09-08