# Simple Linear Regression

The following table contains data on the winning bid prices for 12 Saturn cars on eBay in July 2002. Each car's mileage is also given, and the cars are arranged in increasing order of Miles.

| Car | Miles  | Price ($) |
|----:|-------:|----------:|
| 1   | 9300   | 7100      |
| 2   | 10565  | 15500     |
| 3   | 15000  | 4400      |
| 4   | 15000  | 4400      |
| 5   | 17764  | 5900      |
| 6   | 57000  | 4600      |
| 7   | 65940  | 8800      |
| 8   | 73676  | 2000      |
| 9   | 77006  | 2750      |
| 10  | 93739  | 2550      |
| 11  | 146088 | 960       |
| 12  | 153260 | 1025      |

Problem: Based on the data, how much do I expect to get for a Saturn car that has been driven 60000 miles?

An initial analysis would go like this: "Car 7 has 65000 miles and has a bid of $8800. I should expect to get a little more for mine, maybe $9000(?). However, Car 6 only has 57000 miles, yet the high bid is only $4600. Based on this observation, I should expect to get a little less than $4600, maybe $4400(?)." This type of ad hoc data analysis looks at a few observations (Cars 6 and 7) without considering the rest of the data.

Simple linear regression is a data analysis technique that tries to find a linear pattern in the data. In linear regression, we use all of the data to calculate a straight line which may be used to predict Price based on Miles. Since Miles is used to predict Price, Miles is called an Explanatory Variable while Price is called a Response Variable. Table 11.1 shows a scatterplot of Price (on the Y-axis) versus Miles (on the X-axis).

Notice that the points seem to fall around a straight line sloping downwards. Can you draw this line? We will discuss one way to do this, called the least squares (LS) method. For now, suppose that the LS line has already been computed (we will do this later). The LS line overlaid on the scatterplot looks like Figure 11.2.

The formula for this line, in the form Y = a + bX, is

$$\text{Price} = 8136 - 0.05127 \times \text{Miles}$$

The slope of the line is -0.05127, which means that predicted Price tends to drop about 5 cents for every additional mile driven, or about $512.70 for every 10,000 miles. The intercept (or Y-intercept) of the line is $8136; this should not be interpreted as the predicted price of a car with 0 mileage, because the data provide information only for Saturn cars between 9,300 miles and 153,260 miles.
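The slope and intercept quoted above can be checked against the table. The following is a minimal sketch, assuming the standard closed-form least squares formulas (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X); the function name `least_squares_line` is hypothetical and chosen only for illustration, and the data are transcribed from the table of 12 cars.

```python
# Data transcribed from the table of 12 Saturn cars (Miles, Price in dollars).
miles = [9300, 10565, 15000, 15000, 17764, 57000,
         65940, 73676, 77006, 93739, 146088, 153260]
price = [7100, 15500, 4400, 4400, 5900, 4600,
         8800, 2000, 2750, 2550, 960, 1025]


def least_squares_line(x, y):
    """Return (a, b) for the least squares line Y = a + bX.

    Hypothetical helper: uses the standard closed-form formulas
    b = Sxy / Sxx and a = mean(y) - b * mean(x).
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sum of cross-products and sum of squares of deviations from the means.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b = sxy / sxx
    a = y_mean - b * x_mean
    return a, b


a, b = least_squares_line(miles, price)
print(f"intercept a = {a:.0f}, slope b = {b:.5f}")
```

The printed values should come out close to the intercept and slope given in the text; small differences would only reflect rounding.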

We can now use the line to predict the selling price of a car with 60000 miles. What is the height or Y value of the line at X = 60000? The answer is

$$8136 - 0.05127 \times 60000 = 5059.80$$

or about $5000 to $5100.
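The same prediction can be written as a short sketch using the rounded intercept and slope quoted in the text. The function name `predict_price` and the range check are illustrative assumptions, not part of the original notes; the check simply reflects the caution that the line is supported only by cars between 9,300 and 153,260 miles.

```python
# Rounded coefficients quoted in the text for the line Y = a + bX.
INTERCEPT = 8136.0   # dollars
SLOPE = -0.05127     # dollars per mile

# Range of Miles covered by the 12 cars in the table.
MIN_MILES, MAX_MILES = 9300, 153260


def predict_price(miles):
    """Predicted Price (dollars) at the given Miles, using the quoted line.

    Hypothetical helper: refuses to extrapolate outside the observed
    mileage range, an illustrative design choice.
    """
    if not MIN_MILES <= miles <= MAX_MILES:
        raise ValueError("mileage is outside the range covered by the data")
    return INTERCEPT + SLOPE * miles


# Height of the line at X = 60000.
print(round(predict_price(60000), 2))
```

Calling `predict_price(0)` raises an error rather than returning the intercept, since a car with 0 miles lies outside the data.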

2003-09-08