- 1.
- Consider the following data:
| X | Y |
|---|---|
| -2 | 0 |
| 2 | 3 |
| 5 | 10 |
| -1 | 1 |
| 6 | 15 |

- (a)
- Compute the correlation between Y and X.
- (b)
- Compute the mean and SD of X.
- (c)
- Compute the mean and SD of Y.
- (d)
- Compute the slope of the regression line for predicting Y from X.
- (e)
- Compute the intercept of the regression line for predicting Y from X.
- (f)
- Write down the regression equation for predicting Y from X.
- (g)
- Add 5 to Y, so the new values are 5, 8, 15, 6, 20. What do you think will happen to the slope and intercept? (Hint: What happens to the scatterplot if you add 5 to Y?). Verify your answer by calculation.
- (h)
- Multiply Y by 5, so the new values are 0, 15, 50, 5, 75. What do you think will happen to the slope and intercept? (Hint: What happens to the scatterplot if you multiply Y by 5?). Verify your answer by calculation.
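
As one way to check parts (d) through (h) numerically, the sketch below fits the least-squares line in plain Python. It assumes the flattened data read as the pairs (X, Y) = (-2, 0), (2, 3), (5, 10), (-1, 1), (6, 15), which is consistent with the Y-values quoted in parts (g) and (h). The helper name `fit_line` is illustrative and not part of the exercise.

```python
from statistics import mean


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line predicting y from x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Assumed reading of the Exercise 1 data as (X, Y) pairs.
X = [-2, 2, 5, -1, 6]
Y = [0, 3, 10, 1, 15]

base = fit_line(X, Y)                       # original data
shifted = fit_line(X, [y + 5 for y in Y])   # part (g): add 5 to Y
scaled = fit_line(X, [5 * y for y in Y])    # part (h): multiply Y by 5

for label, (b1, b0) in [("original", base), ("Y + 5", shifted), ("5 * Y", scaled)]:
    print(f"{label:>8}: slope = {b1:.4f}, intercept = {b0:.4f}")
```

Comparing the three printed lines against your predictions from the scatterplot hints is the intended check for parts (g) and (h).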

- 2.
- Consider the data in Exercise 1.
- (a)
- Write down the regression equation for predicting Y from X.
- (b)
- If a sixth point were found, and this point had value X=4, what is the regression prediction of the Y-value?
- (c)
- Compute the Predicted Value for each of the five data points. Do the Predicted Values have the same average as Y?
- (d)
- Which of the five Y-values are higher than their Predicted Value?
- (e)
- Compute the Residual for each of the five data points. Do the Residuals have 0 average?
- (f)
- Compute SSE and SSTo. Is SSE smaller than SSTo?
- (g)
- Compute SSR and *R*^{2}. How is *R*^{2} related to the correlation between X and Y?
- (h)
- What percentage of SSTo is 'explained' by X? What percentage of SSTo is not explained by X?
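
The quantities in parts (c) through (h) can be checked with a short script. This is a minimal sketch, assuming the Exercise 1 data are the pairs (X, Y) = (-2, 0), (2, 3), (5, 10), (-1, 1), (6, 15) and that SSTo, SSE, and SSR denote the total, residual, and regression sums of squares; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean

# Assumed reading of the Exercise 1 data as (X, Y) pairs.
X = [-2, 2, 5, -1, 6]
Y = [0, 3, 10, 1, 15]

x_bar, y_bar = mean(X), mean(Y)
sxx = sum((x - x_bar) ** 2 for x in X)
syy = sum((y - y_bar) ** 2 for y in Y)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))

# Least-squares line for predicting Y from X.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Predicted values and residuals (observed minus predicted).
predicted = [intercept + slope * x for x in X]
residuals = [y - p for y, p in zip(Y, predicted)]

sse = sum(e ** 2 for e in residuals)   # variation left unexplained
ssto = syy                             # total variation of Y about its mean
ssr = ssto - sse                       # variation explained by X
r_squared = ssr / ssto
r = sxy / sqrt(sxx * syy)              # correlation between X and Y

print("predicted:", [round(p, 2) for p in predicted])
print("residuals:", [round(e, 2) for e in residuals])
print(f"SSE = {sse:.2f}, SSTo = {ssto:.2f}, SSR = {ssr:.2f}")
print(f"R^2 = {r_squared:.4f}, r^2 = {r ** 2:.4f}")
```

Printing `r_squared` next to `r ** 2` makes the relationship asked about in part (g) easy to see.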

- 3.
- Consider the data in Exercise 1.
- (a)
- Fill in the following ANOVA table (the P-value for F is optional):
| ANOVA | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | ___ | _______ | _______ | _______ | _______ |
| Residual | ___ | _______ | _______ | | |
| Total | ___ | _______ | | | |

- (b)
- If a sixth point were found, and this point had value X=4, the Y-value is predicted to be _________ give or take _________.
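
The ANOVA entries follow from the sums of squares and their degrees of freedom. The sketch below is one possible check, assuming the Exercise 1 data are the pairs (X, Y) = (-2, 0), (2, 3), (5, 10), (-1, 1), (6, 15). It also assumes that "give or take" refers to the residual standard error, the square root of the residual mean square; the exercise does not define the phrase, so treat that reading as an assumption. The optional P-value is not computed here.

```python
from math import sqrt
from statistics import mean

# Assumed reading of the Exercise 1 data as (X, Y) pairs.
X = [-2, 2, 5, -1, 6]
Y = [0, 3, 10, 1, 15]
n = len(X)

x_bar, y_bar = mean(X), mean(Y)
sxx = sum((x - x_bar) ** 2 for x in X)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
ssto = sum((y - y_bar) ** 2 for y in Y)

slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Sums of squares: regression, residual, total.
ssr = slope * sxy
sse = ssto - ssr

# Degrees of freedom for simple linear regression (one predictor).
df_reg, df_res, df_tot = 1, n - 2, n - 1

msr = ssr / df_reg
mse = sse / df_res
f_stat = msr / mse

print(f"Regression  df={df_reg}  SS={ssr:.2f}  MS={msr:.2f}  F={f_stat:.2f}")
print(f"Residual    df={df_res}  SS={sse:.2f}  MS={mse:.2f}")
print(f"Total       df={df_tot}  SS={ssto:.2f}")

# Part (b): prediction at X = 4, with the residual standard error
# used as the assumed "give or take" figure.
y_hat_at_4 = intercept + slope * 4
rmse = sqrt(mse)
print(f"Prediction at X=4: {y_hat_at_4:.2f}, give or take about {rmse:.2f}")
```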

- 4.
- Consider data from a random sample of records of home resales
from Feb 15 to Apr 30, 1993, taken from the files maintained
by the Albuquerque Board of Realtors. This type of data
is collected by multiple listing agencies in many cities
and is used by realtors as an information base.
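| Variable | Description |
|---|---|
| PRICE | Selling price (in hundred dollars) |
| SQFT | Square feet of living space |
| AGE | Age of home (years) |
| FEATS | Number out of 11 features (dishwasher, refrigerator, microwave, disposer, washer, intercom, skylight(s), compactor, dryer, handicap fit, cable TV access) |
| NE | Located in northeast sector of city (1) or not (0) |
| CUST | Custom built (1) or not (0) |
| COR | Corner location (1) or not (0) |
| TAX | Annual taxes (dollars) |

An asterisk (*) marks a missing value.

| PRICE | SQFT | AGE | FEATS | NE | CUST | COR | TAX |
|---|---|---|---|---|---|---|---|
| 2050 | 2650 | 13 | 7 | 1 | 1 | 0 | 1639 |
| 2080 | 2600 | * | 4 | 1 | 1 | 0 | 1088 |
| 2150 | 2664 | 6 | 5 | 1 | 1 | 0 | 1193 |
| 2150 | 2921 | 3 | 6 | 1 | 1 | 0 | 1635 |
| 1999 | 2580 | 4 | 4 | 1 | 1 | 0 | 1732 |
| 1900 | 2580 | 4 | 4 | 1 | 0 | 0 | 1534 |
| 1800 | 2774 | 2 | 4 | 1 | 0 | 0 | 1765 |
| 1560 | 1920 | 1 | 5 | 1 | 1 | 0 | 1161 |
| 1450 | 2150 | * | 4 | 1 | 0 | 0 | * |
| 1449 | 710 | 1 | 3 | 1 | 1 | 0 | 1010 |
| 1375 | 1837 | 4 | 5 | 1 | 0 | 0 | 1191 |
| 1270 | 1880 | 8 | 6 | 1 | 0 | 0 | 930 |
| 1250 | 2150 | 5 | 3 | 1 | 0 | 0 | 984 |
| 1235 | 1894 | 4 | 5 | 1 | 1 | 0 | 1112 |
| 1170 | 1928 | 8 | 8 | 1 | 1 | 0 | 600 |
| 1180 | 1830 | * | 3 | 1 | 0 | 0 | 733 |
| 1155 | 1767 | 16 | 4 | 1 | 0 | 0 | 794 |
| 1110 | 1630 | 15 | 3 | 1 | 0 | 1 | 867 |
| 1139 | 1680 | 17 | 4 | 1 | 0 | 1 | 750 |
| 995 | 1725 | * | 3 | 1 | 0 | 0 | 923 |
| 995 | 1500 | 15 | 4 | 1 | 0 | 0 | 743 |
| 975 | 1430 | * | 3 | 1 | 0 | 0 | 752 |
| 975 | 1360 | * | 4 | 1 | 0 | 0 | 696 |
| 900 | 1400 | 16 | 2 | 1 | 0 | 1 | 731 |
| 960 | 1573 | 17 | 6 | 1 | 0 | 0 | 768 |
| 860 | 1385 | * | 2 | 1 | 0 | 0 | 653 |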

- (a)
- Set up a scatter diagram of PRICE (X) vs. TAX (Y).
- (b)
- Assuming a linear relationship, use the least-squares method to find the regression coefficients.
- (c)
- Interpret the meaning of the slope estimate.
- (d)
- Compute *R*^{2}.
- (e)
- At the 0.05 level of significance, is there a significant linear relationship between PRICE and TAX?
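
Parts (b), (d), and (e) can be approached with the sketch below, which regresses TAX (Y) on PRICE (X) in plain Python. The (PRICE, TAX) pairs are copied from the data above, and the one home with a missing TAX is dropped. Two assumptions are made that the exercise does not state: the significance test is carried out as a two-sided t-test on the slope (equivalent to the F-test in simple regression), and the critical value 2.069 is taken from a standard t-table for 23 degrees of freedom.

```python
from math import sqrt

# (PRICE, TAX) pairs copied from the data table; None marks a '*' (missing) entry.
rows = [
    (2050, 1639), (2080, 1088), (2150, 1193), (2150, 1635), (1999, 1732),
    (1900, 1534), (1800, 1765), (1560, 1161), (1450, None), (1449, 1010),
    (1375, 1191), (1270, 930), (1250, 984), (1235, 1112), (1170, 600),
    (1180, 733), (1155, 794), (1110, 867), (1139, 750), (995, 923),
    (995, 743), (975, 752), (975, 696), (900, 731), (960, 768), (860, 653),
]

# Keep only homes with both values present.
pairs = [(p, t) for p, t in rows if p is not None and t is not None]
n = len(pairs)

x_bar = sum(p for p, _ in pairs) / n
y_bar = sum(t for _, t in pairs) / n
sxx = sum((p - x_bar) ** 2 for p, _ in pairs)
sxy = sum((p - x_bar) * (t - y_bar) for p, t in pairs)
ssto = sum((t - y_bar) ** 2 for _, t in pairs)

# Least-squares coefficients for predicting TAX from PRICE.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

sse = sum((t - (intercept + slope * p)) ** 2 for p, t in pairs)
r_squared = 1 - sse / ssto

# t statistic for H0: slope = 0, with n - 2 degrees of freedom.
se_slope = sqrt(sse / (n - 2) / sxx)
t_stat = slope / se_slope

T_CRIT = 2.069  # assumed table value: two-sided 0.05 level, 23 df (n = 25)
significant = abs(t_stat) > T_CRIT

print(f"n = {n}, slope = {slope:.4f}, intercept = {intercept:.2f}")
print(f"R^2 = {r_squared:.4f}, t = {t_stat:.2f}, significant at 0.05: {significant}")
```

Because PRICE is recorded in hundreds of dollars, the slope is in dollars of annual tax per additional hundred dollars of selling price, which is the unit to use when interpreting it in part (c).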