Next: Linear Regression Up: Correlation Previous: Computing the Pearson Correlation

Exercises



1.
Consider the following data:

 X    Y
-2    0
 2    3
 5   10
-1    1
 6   15

 

(a)
Compute the correlation between X and Y.
(b)
Compute the correlation between Y and X.
(c)
Add 5 to Y, so the new values are 5, 8, 15, 6, 20. Now compute the correlation between X and Y. Is the correlation smaller, larger, or the same as before?
(d)
Multiply Y by 5, so the new values are 0, 15, 50, 5, 75. Now compute the correlation between X and Y. Is the correlation smaller, larger, or the same as before?
(e)
Multiply Y by -1, so that the new values are 0, -3, -10, -1, -15. Now compute the correlation between X and Y. Is the correlation smaller, larger, or the same as before?
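The computations in parts (a) through (e) can be checked against hand or calculator work with a short script. The following is a minimal Python sketch, not part of the original exercise: `pearson_r` is a hypothetical helper name, and it implements the usual sum-of-products form of the Pearson correlation.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-2, 2, 5, -1, 6]
y = [0, 3, 10, 1, 15]

r_xy = pearson_r(x, y)                       # part (a)
r_yx = pearson_r(y, x)                       # part (b)
r_shift = pearson_r(x, [v + 5 for v in y])   # part (c): add 5 to Y
r_scale = pearson_r(x, [v * 5 for v in y])   # part (d): multiply Y by 5
r_flip = pearson_r(x, [-v for v in y])       # part (e): multiply Y by -1

for label, r in [("(a)", r_xy), ("(b)", r_yx), ("(c)", r_shift),
                 ("(d)", r_scale), ("(e)", r_flip)]:
    print(label, round(r, 4))
```

Comparing the five printed values side by side makes it easy to see which transformations of Y leave the correlation unchanged and which one changes it.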
2.

The following data present the 1960 per capita income of 20 OECD countries (most of them European), together with the percentages of the labor force employed in agriculture, industry, and services in each country.



PCINC = Per capita income, 1960 ($)
AGR = Percent of labor force in agriculture, 1960
IND = Percent of labor force in industry, 1960
SER = Percent of labor force in service occupations, 1960

COUNTRY PCINC AGR IND SER
CANADA 1536 13 43 45
SWEDEN 1644 14 53 33
SWITZERLAND 1361 11 56 33
LUXEMBURG 1242 15 51 34
U. KINGDOM 1105 4 56 40
DENMARK 1049 18 45 37
W. GERMANY 1035 15 60 25
FRANCE 1013 20 44 36
BELGIUM 1005 6 52 42
NORWAY 977 20 49 32
ICELAND 839 25 47 29
NETHERLANDS 810 11 49 40
AUSTRIA 681 23 47 30
IRELAND 529 36 30 34
ITALY 504 27 46 28
JAPAN 344 33 35 32
GREECE 324 56 24 20
SPAIN 290 42 37 21
PORTUGAL 238 44 33 23
TURKEY 177 79 12 9

       

Using your TI-83 calculator, compute the correlation coefficient for each of the following pairs: PCINC vs. AGR, PCINC vs. IND, and PCINC vs. SER.

(a)
Which among the three labor sectors provides the strongest linear relationship with per capita income?
(b)
If the majority of the labor force works in agriculture, would you expect a higher per capita income?
(c)
Suppose PCINC (per capita income) is coded in thousands of dollars instead. What happens to the correlation coefficients?
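
The exercise asks for a TI-83, but the same three coefficients can be cross-checked in software. The sketch below is an illustration only: the rows are transcribed from the table above, and `pearson_r`, `corr`, and `corr_k` are hypothetical names introduced here. It also recomputes the coefficients with PCINC expressed in thousands of dollars, as in part (c).

```python
import math

# (COUNTRY, PCINC, AGR, IND, SER), transcribed from the table above
rows = [
    ("CANADA", 1536, 13, 43, 45), ("SWEDEN", 1644, 14, 53, 33),
    ("SWITZERLAND", 1361, 11, 56, 33), ("LUXEMBURG", 1242, 15, 51, 34),
    ("U. KINGDOM", 1105, 4, 56, 40), ("DENMARK", 1049, 18, 45, 37),
    ("W. GERMANY", 1035, 15, 60, 25), ("FRANCE", 1013, 20, 44, 36),
    ("BELGIUM", 1005, 6, 52, 42), ("NORWAY", 977, 20, 49, 32),
    ("ICELAND", 839, 25, 47, 29), ("NETHERLANDS", 810, 11, 49, 40),
    ("AUSTRIA", 681, 23, 47, 30), ("IRELAND", 529, 36, 30, 34),
    ("ITALY", 504, 27, 46, 28), ("JAPAN", 344, 33, 35, 32),
    ("GREECE", 324, 56, 24, 20), ("SPAIN", 290, 42, 37, 21),
    ("PORTUGAL", 238, 44, 33, 23), ("TURKEY", 177, 79, 12, 9),
]

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

pcinc = [row[1] for row in rows]
pcinc_thousands = [v / 1000 for v in pcinc]  # part (c): recode in $1000s
sectors = {"AGR": 2, "IND": 3, "SER": 4}     # column position of each sector

corr = {}    # correlations with PCINC in dollars
corr_k = {}  # correlations with PCINC in thousands of dollars
for name, col in sectors.items():
    values = [row[col] for row in rows]
    corr[name] = pearson_r(pcinc, values)
    corr_k[name] = pearson_r(pcinc_thousands, values)
    print(name, round(corr[name], 3), round(corr_k[name], 3))
```

For part (a), compare the absolute values of the three coefficients rather than their signs, since the strength of a linear relationship does not depend on its direction.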

3.
Consider the first and second exam scores of 35 Stat 216 students:

Student First Second Student First Second
1 21 22 19 25 22
2 23 23 20 13 19
3 16 19 21 17 22
4 23 19 22 23 18
5 23 24 23 11 21
6 17 21 24 17 14
7 12 18 25 18 11
8 15 16 26 13 16
9 20 20 27 18 11
10 8 10 28 16 15
11 22 24 29 21 17
12 22 22 30 15 9
13 23 22 31 16 22
14 18 19 32 22 16
15 22 23 33 18 16
16 20 20 34 21 13
17 20 20 35 19 24
18 20 20      
(a)
Draw a scatterplot of the data. How are the two exam scores related based on the plot? Would you say that this relationship is strong?
(b)
Compute the correlation coefficient between the first and second exam scores. Does this value support your judgment in part (a)?
(c)
The Stat 216 director decided to curve the first exam by adding 5 points to every student's score.
i.
Obtain a new scatterplot and compare this with the old plot.
ii.
What happens to the correlation coefficient? Explain this behavior.
(d)
The fifth and eleventh students were found cheating during the second exam. As a result, both were given zeros on that exam. What happens to the correlation coefficient now? Can you consider this new value reliable? Explain why.
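
Parts (b) through (d) of this exercise can be explored numerically. The following Python sketch is an illustration under stated assumptions: the scores are transcribed from the table above, `pearson_r` is a hypothetical helper, and the variable names are chosen here for readability. It does not draw the scatterplots; those are left to the calculator or a plotting tool.

```python
import math

# Scores for students 1..35, transcribed from the table above
first = [21, 23, 16, 23, 23, 17, 12, 15, 20, 8, 22, 22, 23, 18, 22, 20, 20, 20,
         25, 13, 17, 23, 11, 17, 18, 13, 18, 16, 21, 15, 16, 22, 18, 21, 19]
second = [22, 23, 19, 19, 24, 21, 18, 16, 20, 10, 24, 22, 22, 19, 23, 20, 20, 20,
          22, 19, 22, 18, 21, 14, 11, 16, 11, 15, 17, 9, 22, 16, 16, 13, 24]

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Part (b): correlation of the original scores
r_orig = pearson_r(first, second)

# Part (c): every first-exam score raised by 5 points
first_curved = [score + 5 for score in first]
r_curved = pearson_r(first_curved, second)

# Part (d): students 5 and 11 (list positions 4 and 10) get zero on the second exam
second_zeroed = list(second)
for student in (5, 11):
    second_zeroed[student - 1] = 0
r_zeroed = pearson_r(first, second_zeroed)

print("original:", round(r_orig, 3))
print("curved:  ", round(r_curved, 3))
print("zeroed:  ", round(r_zeroed, 3))
```

Only two of the 35 observations change in part (d), so the size of the shift in the printed value gives a sense of how sensitive the correlation coefficient is to a few extreme points.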



2003-09-08