The following table contains data on the winning bid prices for 12 Saturn cars on eBay in July 2002. The car mileage is also given, and the cars have been arranged in increasing order of Miles.

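| Car | Miles | Price ($) |
|-----|--------|-----------|
| 1 | 9300 | 7100 |
| 2 | 10565 | 15500 |
| 3 | 15000 | 4400 |
| 4 | 15000 | 4400 |
| 5 | 17764 | 5900 |
| 6 | 57000 | 4600 |
| 7 | 65940 | 8800 |
| 8 | 73676 | 2000 |
| 9 | 77006 | 2750 |
| 10 | 93739 | 2550 |
| 11 | 146088 | 960 |
| 12 | 153260 | 1025 |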

An initial analysis would go like this: "Car 7 has 65000 miles and has a bid of $8800. I should expect to get a little more for mine, maybe $9000 (?). However, Car 6 only has 57000 miles, yet the high bid is only $4600. Based on this observation, I should expect to get a little less than $4600, maybe $4400 (?)." This type of ad hoc data analysis looks at a few observations (Cars 6 and 7) without considering the rest of the data.

Problem: Based on the data, how much do I expect to get for a Saturn car that has been driven 60000 miles?

Simple linear regression
is a data analysis technique that tries to find *a linear pattern
in the data*. In linear regression, we use all of the data to calculate a straight line
which may be used to predict Price based on Miles.
Since Miles is used to predict Price,
Miles is called the *explanatory variable*
while Price is called the *response variable*.
Figure 11.1 shows a scatterplot of
Price (on the Y-axis) versus Miles (on the X-axis):

Notice that the points seem to fall around *a straight line* sloping downwards.
Can you draw this line? We will discuss one way to do this,
called the *least squares* (LS) method.
For now, suppose that the LS line has already been computed (we will
do this later).
The LS line overlaid on the scatterplot
is shown in Figure 11.2.

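The formula for this line, in the form *Y* = *a* + *bX*, is

*Price* = 8136 - 0.05127 × *Miles*

with coefficients rounded. The intercept is *a* = 8136 and the slope is *b* = -0.05127, so each additional mile lowers the predicted price by about 5 cents.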
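For readers who want to check the line numerically, here is a minimal sketch of the usual closed-form least squares calculation applied to the twelve cars in the table. The variable names (`miles`, `price`, `sxy`, `sxx`, `a`, `b`) are chosen for illustration and are not from the text; the text derives the method later, so treat this as a preview rather than the chapter's own presentation.

```python
# Data from the table: mileage and winning bid for the 12 cars.
miles = [9300, 10565, 15000, 15000, 17764, 57000,
         65940, 73676, 77006, 93739, 146088, 153260]
price = [7100, 15500, 4400, 4400, 5900, 4600,
         8800, 2000, 2750, 2550, 960, 1025]

n = len(miles)
mean_x = sum(miles) / n   # average mileage
mean_y = sum(price) / n   # average price

# Slope: sum of cross-deviations divided by sum of squared x-deviations.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(miles, price))
sxx = sum((x - mean_x) ** 2 for x in miles)
b = sxy / sxx

# Intercept: the least squares line passes through (mean_x, mean_y).
a = mean_y - b * mean_x

print(f"intercept a = {a:.1f}, slope b = {b:.4f}")
```

Run on the table's data, this should give an intercept near 8136 and a slope near -0.0513 dollars per mile.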

We can now use the line to predict
the selling price of a car with 60000 miles. What is
the height or *Y* value of the line at *X* = 60000? The answer is

8136 - 0.05127 × 60000 ≈ 5060,

or about $5000 to $5100.