Consider the following exercise.
Suppose that the management of a chain of package delivery stores would
like to develop a model for predicting the weekly sales (in thousands of
dollars) for individual stores based on the number of customers who made
purchases. A random sample of 20 stores was selected from among all the
stores in the chain. Since we wish to predict Sales from the number of
Customers, Sales is the dependent (response, or "Y") variable, and
number of Customers is the independent (explanatory, or "X")
variable.
Customers   Weekly Sales
---------   ------------
      907          11.20
      926          11.05
      506           6.84
      741           9.21
      789           9.42
      889          10.08
      874           9.45
      510           6.73
      529           7.24
      420           6.12
      679           7.63
      872           9.43
      924           9.46
      607           7.64
      452           6.92
      729           8.95
      794           9.33
      844          10.23
     1010          11.77
      621           7.41
Part (a) asks for a scatter diagram, part (b) asks for the regression
coefficients, part (c) asks for an interpretation of the slope, and part
(d) asks for the predicted value of Y when X=600. Parts (a) and (b)
require Excel, and part (d) can either be done by hand or with the
Enter the Package data set into Excel.
Choose Tools | Data Analysis, and then Regression.
Y Variable Cell Range is b1:b21, since Sales is the response variable.
X Variable Cell Range is a1:a21, since Customers is the explanatory variable.
Make sure the "First cells in both ranges contain label" box is checked.
Check the Residual plot box, even though this isn't asked for in this problem.
Check the Line Fit Plot box.
Many sheets will be created.
(a) Set up a scatter diagram.
Here is the scatter plot from Excel:
Notice the increasing relationship. As the number of Customers
increases, Sales increase.
Here is the regression output:
(b) Assuming a linear relationship, use the least-squares method to find
the regression coefficients b0 and b1.
From the output, we see that
b0 = 2.423, and
b1 = 0.00873.
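For readers without Excel, the same coefficients can be checked with a short script. This is a minimal sketch, not part of the original exercise: it assumes the data are typed in exactly as listed in the table, and the function name least_squares is only illustrative. It uses the textbook least-squares formulas (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X), and its output should agree with the Excel values up to rounding.

```python
# Data from the Package exercise: customers (X) and weekly sales in $1000s (Y).
customers = [907, 926, 506, 741, 789, 889, 874, 510, 529, 420,
             679, 872, 924, 607, 452, 729, 794, 844, 1010, 621]
sales = [11.20, 11.05, 6.84, 9.21, 9.42, 10.08, 9.45, 6.73, 7.24, 6.12,
         7.63, 9.43, 9.46, 7.64, 6.92, 8.95, 9.33, 10.23, 11.77, 7.41]


def least_squares(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx             # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1


b0, b1 = least_squares(customers, sales)
print(f"b0 = {b0:.3f}, b1 = {b1:.5f}")
```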
(c) Interpret the meaning of the slope b1 in this problem.
For each additional customer, predicted weekly Sales increase by
0.00873 thousand dollars, or $8.73, on average. Remember that the Sales
numbers are in thousands of dollars.
(d) Predict the average weekly Sales (in thousands of dollars) for
stores that have 600 customers.
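We can plug into the regression equation for this by hand:
Predicted Sales = b0 + b1(600) = 2.423 + 0.00873(600) = 7.661
So, the average weekly Sales for stores with 600 Customers is
predicted to be about 7.661 thousand dollars, or $7,661.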
(g) How much variation in Sales is explained by number of Customers?
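The quantity asked for here is the coefficient of determination, r squared, which Excel reports as "R Square" in the regression output. As a hedged sketch (the function name r_squared is illustrative, and the data are assumed to be entered as in the table), it can be computed directly as Sxy squared divided by Sxx times Syy:

```python
# Data from the Package exercise: customers (X) and weekly sales in $1000s (Y).
customers = [907, 926, 506, 741, 789, 889, 874, 510, 529, 420,
             679, 872, 924, 607, 452, 729, 794, 844, 1010, 621]
sales = [11.20, 11.05, 6.84, 9.21, 9.42, 10.08, 9.45, 6.73, 7.24, 6.12,
         7.63, 9.43, 9.46, 7.64, 6.92, 8.95, 9.33, 10.23, 11.77, 7.41]


def r_squared(x, y):
    """Fraction of the variation in y explained by a straight line in x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)   # total variation in y
    return sxy ** 2 / (sxx * syy)


r2 = r_squared(customers, sales)
print(f"r squared = {r2:.4f}")
```

Multiplying the result by 100 gives the percentage of the variation in Sales that is explained by number of Customers.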
(h) What is the standard error of the estimated regression line?
Answer: s = .50150, or $501.50 (remember that s is always in Y units!)
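The value of s can be reproduced from its definition, s = sqrt(SSE / (n - 2)), where SSE is the sum of squared residuals. The following is a sketch under the assumption that the data are entered as in the table; the function name standard_error_of_estimate is illustrative, not something Excel provides.

```python
import math

# Data from the Package exercise: customers (X) and weekly sales in $1000s (Y).
customers = [907, 926, 506, 741, 789, 889, 874, 510, 529, 420,
             679, 872, 924, 607, 452, 729, 794, 844, 1010, 621]
sales = [11.20, 11.05, 6.84, 9.21, 9.42, 10.08, 9.45, 6.73, 7.24, 6.12,
         7.63, 9.43, 9.46, 7.64, 6.92, 8.95, 9.33, 10.23, 11.77, 7.41]


def standard_error_of_estimate(x, y):
    """Return s = sqrt(SSE / (n - 2)) for the least-squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Sum of squared residuals (observed minus fitted values).
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (n - 2))


s = standard_error_of_estimate(customers, sales)
print(f"s = {s:.5f} thousand dollars")
```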
(i) Based on the residual plot, does the linear fit look okay?
Note that the residual plot shows up on the SLR sheet, over to the right.
Answer: Yes, since there isn't any kind of obvious pattern here.
(j) Is there evidence of a linear relationship between
Sales and number of Customers?
Answer: We are testing H0: β1 = 0 (no linear relationship) vs. H1: β1 ≠ 0.
Test statistic: t = 13.6462
Reject H0. There is evidence of a significant relationship.
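The test statistic is the estimated slope divided by its standard error, t = b1 / S_b1, with S_b1 = s / sqrt(Sxx) and n - 2 = 18 degrees of freedom. A hedged sketch of that calculation follows; the function name slope_t_statistic is illustrative, and the data are assumed to be entered as in the table. The result should match the t value in the Excel output up to rounding.

```python
import math

# Data from the Package exercise: customers (X) and weekly sales in $1000s (Y).
customers = [907, 926, 506, 741, 789, 889, 874, 510, 529, 420,
             679, 872, 924, 607, 452, 729, 794, 844, 1010, 621]
sales = [11.20, 11.05, 6.84, 9.21, 9.42, 10.08, 9.45, 6.73, 7.24, 6.12,
         7.63, 9.43, 9.46, 7.64, 6.92, 8.95, 9.33, 10.23, 11.77, 7.41]


def slope_t_statistic(x, y):
    """Return t = b1 / S_b1 for testing a zero slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))     # standard error of the estimate
    s_b1 = s / math.sqrt(sxx)        # standard error of the slope
    return b1 / s_b1


t_stat = slope_t_statistic(customers, sales)
print(f"t = {t_stat:.4f} with {len(customers) - 2} degrees of freedom")
```

With 18 degrees of freedom, the two-sided 5% critical value from a t table is 2.1009, so a t statistic this large falls well inside the rejection region.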
(k) Give a 95% confidence interval for the true slope.
Answer: We can do this (partially) by hand:
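A sketch of the hand calculation, assuming the usual interval form b1 ± t · S_b1 with n - 2 = 18 degrees of freedom. The standard error of the slope, S_b1, is not quoted in the text above, so it is recovered here from the slope and the t statistic (S_b1 = b1 / t). The value 2.1009 is the standard t-table critical value for 18 degrees of freedom at 95% confidence.

```latex
\begin{align*}
S_{b_1} &= \frac{b_1}{t} = \frac{0.00873}{13.6462} \approx 0.000640 \\
b_1 \pm t_{0.025,\,18}\, S_{b_1} &= 0.00873 \pm 2.1009\,(0.000640) \\
&= 0.00873 \pm 0.00134 \\
&\Rightarrow\; 0.00739 \le \beta_1 \le 0.01007
\end{align*}
```

In dollar terms, each additional customer is associated with roughly $7.39 to $10.07 more in average weekly Sales. The interval does not contain 0, which is consistent with rejecting H0 in the t test.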