Consider the following exercise.

Suppose that the management of a chain of package delivery stores would
like to develop a model for predicting the weekly sales (in thousands of
dollars) for individual stores based on the number of customers who made
purchases. A random sample of 20 stores was selected from among all the
stores in the chain. Because we want to predict *Sales* from the number of
*Customers*, *Sales* is the dependent (response, or "Y") variable, and the
number of *Customers* is the independent (explanatory, or "X") variable.

  Customers   Weekly Sales
  ---------   ------------
  907         11.20
  926         11.05
  506         6.84
  741         9.21
  789         9.42
  889         10.08
  874         9.45
  510         6.73
  529         7.24
  420         6.12
  679         7.63
  872         9.43
  924         9.46
  607         7.64
  452         6.92
  729         8.95
  794         9.33
  844         10.23
  1010        11.77
  621         7.41

Part (a) asks for a scatter diagram, part (b) asks for the regression
coefficients, part (c) asks for an interpretation of the slope, and part
(d) asks for the predicted value of Y when X=600. Parts (a) and (b)
require Excel, and part (d) can either be done by hand or with the
computer.

Enter the *Package* data set into Excel.

Choose **Tools** | **Data Analysis**, and then **Regression**.

**Y Variable Cell Range** is b1:b21, since *Sales* is the response variable.

**X Variable Cell Range** is a1:a21, since *Customers* is the
explanatory variable.

Make sure the **First cells in both ranges contain label** box is
checked.

Check the **Residual plot** box, even though this isn't asked for in
this problem.

Check the **Line Fit Plot** box.

Click `OK`.

(a) Set up a scatter diagram.

Here is the scatter plot from Excel:

Notice the increasing relationship. As the number of *Customers*
increases, *Sales* increase.

Here is the regression output:

(b) Assuming a linear relationship, use the least-squares method to find
the regression coefficients *b*_{0} and *b*_{1}.

From the output, we see that
*b*_{0} = 2.423, and
*b*_{1} = 0.00873.
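For readers who want to check the Excel output, the least-squares formulas can be applied directly to the 20 data pairs. The following is a minimal sketch in plain Python; the variable names (`customers`, `sales`, `ss_xx`, `ss_xy`) are illustrative choices, not part of the exercise.

```python
# Weekly data for the 20 sampled stores, copied from the table in the exercise.
customers = [907, 926, 506, 741, 789, 889, 874, 510, 529, 420,
             679, 872, 924, 607, 452, 729, 794, 844, 1010, 621]
sales = [11.20, 11.05, 6.84, 9.21, 9.42, 10.08, 9.45, 6.73, 7.24, 6.12,
         7.63, 9.43, 9.46, 7.64, 6.92, 8.95, 9.33, 10.23, 11.77, 7.41]

n = len(customers)
x_bar = sum(customers) / n
y_bar = sum(sales) / n

# Sum of squares of X and sum of cross-products, both about the means.
ss_xx = sum((x - x_bar) ** 2 for x in customers)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(customers, sales))

b1 = ss_xy / ss_xx        # least-squares slope
b0 = y_bar - b1 * x_bar   # least-squares intercept

print(f"b0 = {b0:.3f}, b1 = {b1:.5f}")
```

Up to rounding, this should agree with the intercept and slope read from the Excel regression output.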

(c) Interpret the meaning of the slope *b*_{1} in this problem.

For each additional customer, predicted weekly *Sales* increase by 0.00873
thousand dollars, or $8.73, on average. Remember that the *Sales* numbers
are in thousands of dollars.

(d) Predict the average weekly *Sales* (in thousands of dollars) for
stores that have 600 customers.

We can plug into the regression equation for this by hand:

*Ŷ* = *b*_{0} + *b*_{1}*X* = 2.423 + 0.00873(600) = 7.661

So, the predicted average weekly *Sales* for stores with 600 *Customers*
is $7,661.
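The same prediction can be done "with the computer" outside Excel. Here is a small sketch that plugs X = 600 into the fitted line, using the rounded coefficients from part (b); the function name `predict_sales` is hypothetical and chosen only for this illustration.

```python
# Rounded coefficients as reported in part (b); Sales is in thousands of dollars.
b0 = 2.423
b1 = 0.00873

def predict_sales(num_customers):
    """Predicted weekly sales (thousands of dollars) for a given customer count."""
    return b0 + b1 * num_customers

y_hat = predict_sales(600)
print(f"Predicted weekly sales: {y_hat:.3f} thousand dollars")
```

Because 600 lies inside the observed range of *Customers* (420 to 1010), this is an interpolation rather than an extrapolation.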

(g) How much variation in *Sales* is explained by number of
*Customers*?

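Answer: *r*^{2} = SSR/SST ≈ 0.912, so about 91.2% of the variation in
weekly *Sales* is explained by the number of *Customers*.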

(h) What is the standard error of the estimated regression line?

Answer:
*s* = .50150, or $501.50 (remember that *s* is always in
*Y* units!)
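Parts (g) and (h) both come from the same sums of squares, so they can be checked together. The sketch below, in plain Python with illustrative variable names, refits the line and then computes the coefficient of determination and the standard error of the estimate.

```python
import math

# Weekly data for the 20 sampled stores, copied from the table in the exercise.
customers = [907, 926, 506, 741, 789, 889, 874, 510, 529, 420,
             679, 872, 924, 607, 452, 729, 794, 844, 1010, 621]
sales = [11.20, 11.05, 6.84, 9.21, 9.42, 10.08, 9.45, 6.73, 7.24, 6.12,
         7.63, 9.43, 9.46, 7.64, 6.92, 8.95, 9.33, 10.23, 11.77, 7.41]

n = len(customers)
x_bar = sum(customers) / n
y_bar = sum(sales) / n

# Least-squares fit.
ss_xx = sum((x - x_bar) ** 2 for x in customers)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(customers, sales))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Total and unexplained (error) variation in Sales.
fitted = [b0 + b1 * x for x in customers]
sst = sum((y - y_bar) ** 2 for y in sales)
sse = sum((y - f) ** 2 for y, f in zip(sales, fitted))

r_squared = 1 - sse / sst      # share of variation explained by Customers
s = math.sqrt(sse / (n - 2))   # standard error of the estimate, in Y units

print(f"r^2 = {r_squared:.4f}, s = {s:.5f}")
```

The value of `s` should match the .50150 quoted above, and `r_squared` comes out at roughly 0.912.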

(i) Based on the residual plot, does the linear fit look okay?

Note that the residual plot shows up on the **SLR** sheet, over
to the right.

Answer: Yes, since there isn't any kind of obvious pattern here.
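The residuals that Excel plots can also be listed directly, which makes it easier to look for curvature or a fan shape by eye. This is a rough sketch in plain Python (no plotting library is assumed); it prints each store's residual in order of increasing *Customers*.

```python
# Weekly data for the 20 sampled stores, copied from the table in the exercise.
customers = [907, 926, 506, 741, 789, 889, 874, 510, 529, 420,
             679, 872, 924, 607, 452, 729, 794, 844, 1010, 621]
sales = [11.20, 11.05, 6.84, 9.21, 9.42, 10.08, 9.45, 6.73, 7.24, 6.12,
         7.63, 9.43, 9.46, 7.64, 6.92, 8.95, 9.33, 10.23, 11.77, 7.41]

n = len(customers)
x_bar = sum(customers) / n
y_bar = sum(sales) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(customers, sales))
      / sum((x - x_bar) ** 2 for x in customers))
b0 = y_bar - b1 * x_bar

# Residual = observed Sales minus fitted Sales for each store.
residuals = [y - (b0 + b1 * x) for x, y in zip(customers, sales)]

# List residuals sorted by Customers, the same ordering a residual plot uses.
for x, e in sorted(zip(customers, residuals)):
    print(f"{x:5d}  {e:+.3f}")
```

With an intercept in the model, least-squares residuals always sum to zero, so the thing to look for is whether their signs and sizes change systematically as *Customers* increases.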

(j) Using
,
is there evidence of a linear relationship between
*Sales* and number of *Customers*?

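Answer: We are testing *H*_{0}: *β*_{1} = 0 (no linear relationship)
vs. *H*_{1}: *β*_{1} ≠ 0 (a linear relationship).

Test statistic:
*t* = 13.6462, with *n* − 2 = 18 degrees of freedom

*p*-value: less than 0.0001

Reject *H*_{0}. There is evidence of a significant linear relationship
between *Sales* and number of *Customers*.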
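The *t* statistic for the slope can be recomputed from the data as the slope divided by its standard error. The sketch below does this in plain Python. The critical value 2.1009 is the two-tailed *t* table value for 18 degrees of freedom at a 0.05 level of significance; that level is an assumption here, since the exercise's stated level is not shown in this part.

```python
import math

# Weekly data for the 20 sampled stores, copied from the table in the exercise.
customers = [907, 926, 506, 741, 789, 889, 874, 510, 529, 420,
             679, 872, 924, 607, 452, 729, 794, 844, 1010, 621]
sales = [11.20, 11.05, 6.84, 9.21, 9.42, 10.08, 9.45, 6.73, 7.24, 6.12,
         7.63, 9.43, 9.46, 7.64, 6.92, 8.95, 9.33, 10.23, 11.77, 7.41]

n = len(customers)
x_bar = sum(customers) / n
y_bar = sum(sales) / n
ss_xx = sum((x - x_bar) ** 2 for x in customers)
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(customers, sales))
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate, then standard error of the slope.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(customers, sales))
s = math.sqrt(sse / (n - 2))
se_b1 = s / math.sqrt(ss_xx)

t_stat = b1 / se_b1   # test statistic for H0: slope = 0
T_CRIT = 2.1009       # assumed: two-tailed, alpha = 0.05, df = 18 (t table)

print(f"t = {t_stat:.4f}, reject H0: {abs(t_stat) > T_CRIT}")
```

A statistic this far beyond the critical value leads to rejecting the null hypothesis at any conventional level of significance.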

(k) Give a 95% confidence interval for the true slope.

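Answer: We can do this (partially) by hand, using the standard error of the
slope, SE(*b*_{1}) = *b*_{1}/*t* = 0.00873/13.6462 ≈ 0.00064, and the
critical value *t* = 2.1009 for 18 degrees of freedom:

*b*_{1} ± 2.1009 × SE(*b*_{1}) = 0.00873 ± 0.00134

(.00739, .01007)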
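As a quick arithmetic check, the interval can be rebuilt from numbers already reported in the exercise. This sketch recovers the standard error of the slope from the slope and its *t* statistic, then applies the *t* table value 2.1009 (95% confidence, 18 degrees of freedom); the variable names are illustrative.

```python
# Values reported earlier in the exercise.
b1 = 0.00873       # slope from part (b)
t_stat = 13.6462   # test statistic from part (j)

# t table value for 95% confidence with n - 2 = 18 degrees of freedom.
T_CRIT = 2.1009

# Since t = b1 / SE(b1), the standard error of the slope is b1 / t.
se_b1 = b1 / t_stat
margin = T_CRIT * se_b1

lower = b1 - margin
upper = b1 + margin
print(f"95% CI for the slope: ({lower:.5f}, {upper:.5f})")
```

The interval excludes zero, which is consistent with rejecting the hypothesis of no linear relationship.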