
Simple Linear Regression

The following table contains data on the winning bid prices for 12 Saturn cars on eBay in July 2002. The car mileage is also given, and the cars have been arranged in increasing order of Miles.

                  Car     Miles    Price ($)

                   1       9300      7100
                   2      10565     15500
                   3      15000      4400
                   4      15000      4400
                   5      17764      5900
                   6      57000      4600
                   7      65940      8800
                   8      73676      2000
                   9      77006      2750
                  10      93739      2550
                  11     146088       960
                  12     153260      1025

Problem: Based on the data, how much do I expect to get for a Saturn car that has been driven 60000 miles?
An initial analysis might go like this: "Car 7 has 65940 miles and a high bid of $8800. Mine has fewer miles, so I should expect to get a little more, maybe $9000(?). However, Car 6 has only 57000 miles, yet its high bid is only $4600. Based on this observation, I should expect to get a little less than $4600, maybe $4400(?)." This type of ad hoc data analysis looks at a few observations (Cars 6 and 7) without considering the rest of the data.

Simple linear regression is a data analysis technique that tries to find a linear pattern in the data. In linear regression, we use all of the data to calculate a straight line which may be used to predict Price based on Miles. Since Miles is used to predict Price, Miles is called an `Explanatory Variable' while Price is called a `Response Variable'. Figure 11.1 shows a scatterplot of Price (on the Y-axis) versus Miles (on the X-axis):

Figure 11.1: Scatterplot of Price versus Miles

Notice that the points seem to fall around a straight line sloping downwards. Can you draw this line? We will discuss one way to do this, called the least squares (LS) method. For now, suppose that the LS line has already been computed (we will do this later). The LS line overlaid on the scatterplot looks like Figure 11.2.

Figure 11.2: Least Squares (LS) regression line overlaid on scatterplot

The formula for this line, in the form Y = a + bX, is

\begin{displaymath}\mbox{PREDICTED PRICE} = \$8136 - 0.05127 (\mbox{MILES})\end{displaymath}

The slope of the line is -0.05127, which means that predicted Price drops by about 5 cents for every additional mile driven, or about $512.70 for every 10,000 miles. The intercept (or Y-intercept) of the line is $8136; this should not be interpreted as the predicted price of a car with zero miles, because the data provide information only for Saturn cars with between 9,300 and 153,260 miles.
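The coefficients quoted above can be checked against the table. The following is a minimal Python sketch, not the text's own derivation (which comes later): it assumes the standard closed-form least squares formulas for a single explanatory variable, and the function name `least_squares_line` is purely illustrative.

```python
# Data copied from the table above (12 Saturn cars, eBay, July 2002).
miles = [9300, 10565, 15000, 15000, 17764, 57000,
         65940, 73676, 77006, 93739, 146088, 153260]
price = [7100, 15500, 4400, 4400, 5900, 4600,
         8800, 2000, 2750, 2550, 960, 1025]


def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line y = a + b*x.

    Illustrative helper (not from the text); uses the usual closed-form
    formulas for simple linear regression.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    # The least squares line passes through the point (x_mean, y_mean).
    intercept = y_mean - slope * x_mean
    return intercept, slope


a, b = least_squares_line(miles, price)
print(f"intercept a = {a:.0f}, slope b = {b:.5f}")
print(f"predicted drop per 10,000 miles = {-b * 10_000:.2f}")
```

Because the sketch works with the unrounded slope, the per-10,000-mile figure it prints may differ by a cent or so from the $512.70 obtained from the rounded value.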

We can now use the line to predict the selling price of a car with 60000 miles. What is the height or Y value of the line at X = 60000? The answer is

\begin{displaymath}\mbox{PREDICTED PRICE} = \$8136 - 0.05127 (60000) = \$5059.80,\end{displaymath}

or roughly $5000 to $5100.
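The same prediction can be written as a small Python function. This is only a sketch using the rounded coefficients stated in the text. The name `predicted_price` and the decision to raise an error outside the observed mileage range are assumptions made for illustration; the latter reflects the caution that the line should not be trusted beyond the data.

```python
# Coefficients as stated in the text (rounded values of the LS line).
INTERCEPT = 8136.0   # dollars
SLOPE = -0.05127     # dollars per mile

# Mileage range covered by the 12 cars in the table.
MIN_MILES, MAX_MILES = 9300, 153260


def predicted_price(miles):
    """Height of the LS line at the given mileage, in dollars.

    Illustrative helper: refuses to extrapolate beyond the observed
    mileage range (a design choice of this sketch, not of the text).
    """
    if not MIN_MILES <= miles <= MAX_MILES:
        raise ValueError("mileage is outside the range of the data")
    return INTERCEPT + SLOPE * miles


print(f"{predicted_price(60000):.2f}")  # prints 5059.80
```

Calling `predicted_price(0)` raises a `ValueError` rather than returning the intercept, since no car in the table has a mileage near zero.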
