One formula for the Pearson correlation coefficient r is as follows:
The following numerical example shows how formula (10.1) is used:
We present a second formula that is harder to compute but easier to interpret.
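A form of this second formula that is consistent with how (10.2) is discussed in the rest of this section (a numerator built from products of deviations, and denominator terms that bound r) can be sketched as follows. The notation is an assumption: the sums run over all n data points, and $\bar{X}$ and $\bar{Y}$ denote the sample means of X and Y.

```latex
r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}
         {\sqrt{\sum (X - \bar{X})^{2}} \, \sqrt{\sum (Y - \bar{Y})^{2}}}
```

The two square-root factors in the denominator measure the spread of X and of Y separately, so r does not depend on the units in which either variable is measured.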
Consider the Ad Spending example at the start of this chapter.
Many of the (X, Y) points are above average in both coordinates, since companies that
have higher than average Advertising Spending also have higher than average Impressions. Both $X - \bar{X}$ and $Y - \bar{Y}$ are positive for these companies.
Therefore, the product $(X - \bar{X})(Y - \bar{Y})$ is positive for these companies.
Most of the remaining companies have lower than average Spending and lower than average Impressions.
Both $X - \bar{X}$ and $Y - \bar{Y}$ are negative for these companies, but the product $(X - \bar{X})(Y - \bar{Y})$ is still positive! Hence the numerator in (10.2) tends to be a large positive number for the Ad Spending data.
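If the points instead sloped downward, high X-values would tend to go with low Y-values, so the deviations $X - \bar{X}$ and $Y - \bar{Y}$ would have opposite signs and the product $(X - \bar{X})(Y - \bar{Y})$ would be negative for these points.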
This is partly how the correlation formula (10.2) works.
The denominator terms are included to ensure that r never falls below -1 or rises above +1.
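The sign argument above can be checked numerically with a minimal Python sketch. It assumes the deviation-product form of r (sum of products of deviations, divided by the square roots of the two sums of squared deviations). The function name `pearson_r` and the five data points are hypothetical, made up for illustration; they are not the chapter's Ad Spending data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return (r, products) using the deviation-product form of the formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    dx = [x - x_bar for x in xs]  # deviations of X from its mean
    dy = [y - y_bar for y in ys]  # deviations of Y from its mean
    products = [a * b for a, b in zip(dx, dy)]
    numerator = sum(products)
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("r is undefined when X or Y has no variation")
    return numerator / denominator, products


# Hypothetical illustration data (not from the chapter).
spending = [1.0, 2.0, 3.0, 4.0, 5.0]
impressions = [12.0, 15.0, 21.0, 24.0, 33.0]

# Upward-sloping points: high X goes with high Y.
r_up, products_up = pearson_r(spending, impressions)

# Downward-sloping points: reverse Y so that high X goes with low Y.
r_down, products_down = pearson_r(spending, impressions[::-1])

print("upward products:", products_up, "r =", round(r_up, 4))
print("downward products:", products_down, "r =", round(r_down, 4))
```

In the upward-sloping case no product is negative, so the numerator is positive. Reversing the Y-values makes every nonzero product negative and flips the sign of r. In both cases the denominator keeps r between -1 and +1.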