Is there an association between gender and height? Yes, males tend to be taller than females. A more formal way of saying this is `the height distribution for males tends to be different from that for females'. Is there an association between shoe size and height? Yes, `the height distribution for men who wear size 12 is different from that for men who wear size 8'. Is there an association between GPA and height? No, `the height distribution tends to be the same for 3.0 students as for 3.5 students'.
Two variables A and B are said to be associated if the distribution of B tends to change with the level of A. Otherwise, they are said to be independent.
Therefore, height is associated with gender and with shoe size, but is independent of GPA.
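For readers who prefer symbols, the definition above can be restated in probability notation. This is one common formalization, not necessarily the one used later in these notes; the symbols a and b are introduced here only for illustration.

```latex
\[
A \text{ and } B \text{ are independent}
\iff
P(B \le b \mid A = a) = P(B \le b)
\quad \text{for every value } b \text{ and every level } a \text{ with } P(A = a) > 0 .
\]
```

In words: knowing the level of A does not change the distribution of B. If the equality fails for some level a, the variables are associated.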
Now consider the following table. A total of 189 students entering a business school program were followed as part of an attrition (i.e. dropout or transfer) study. The students were cross-classified according to 4 categories of high school GPA (2.0-2.5, 2.5-3.0, 3.0-3.5, 3.5-4.0) and 3 categories of attrition outcome (`did not return for 2nd year', `returned for 2nd but not for 3rd year', `returned for 3rd year'). Is there an association between HS GPA and college attrition?
[Table: observed frequencies, attrition outcome (rows) by high school GPA category (columns)]
To analyze whether attrition and GPA are independent, we will check whether the attrition distribution remains the same regardless of GPA level. Here is how a statistical report would present the analysis, in stages.
H0: The two variables are independent versus H1: The two variables are associated.
If the two variables are independent, the observed frequencies should be distributed like these expected frequencies (calculations later):
| | 2.0-2.5 | 2.5-3.0 | 3.0-3.5 | 3.5-4.0 |
| No 2nd Year | 16.08 | 3.42 | 7.24 | 11.26 |
| No 3rd Year | 13.12 | 2.79 | 5.90 | 9.18 |
| Return for 3rd Year | 50.80 | 10.80 | 22.86 | 35.56 |
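The expected frequencies can be sketched in a few lines of Python using the usual rule: expected count = (row total × column total) / grand total. The marginal totals below are an assumption. They are not stated in the text but inferred by summing the rows and columns of the expected table and rounding to whole students. The function name `expected_counts` is hypothetical.

```python
# Marginal totals inferred (assumption) from the expected-frequency table:
# its row sums and column sums, rounded to whole students.
row_totals = {"No 2nd Year": 38, "No 3rd Year": 31, "Return for 3rd Year": 120}
col_totals = {"2.0-2.5": 80, "2.5-3.0": 17, "3.0-3.5": 36, "3.5-4.0": 56}
n = sum(row_totals.values())  # grand total: 189 students


def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / n."""
    total = sum(row_totals.values())
    return {
        r: {c: rt * ct / total for c, ct in col_totals.items()}
        for r, rt in row_totals.items()
    }


expected = expected_counts(row_totals, col_totals)

# Print the expected table, rounded to two decimals.
for r, cells in expected.items():
    print(r, [round(v, 2) for v in cells.values()])

# Under independence, every GPA column has the same attrition distribution.
for c, ct in col_totals.items():
    shares = [round(expected[r][c] / ct, 3) for r in row_totals]
    print(c, shares)
```

The computed values agree with the table to within 0.01. The small differences in a few cells suggest the table was built from row proportions rounded to three decimals. The second loop shows what independence means here: in every GPA column the attrition shares are the same, roughly 0.201, 0.164 and 0.635.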
The chi-square test statistic measures whether the observed and expected frequencies are close:
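$$
\chi^2 = \sum_{i=1}^{3} \sum_{j=1}^{4} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = 23.42
$$
Here O_ij and E_ij are the observed and expected frequencies in row i and column j, so the sum has one term for each of the 12 cells of the table.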
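The statistic adds up (observed − expected)² / expected over every cell. A minimal Python sketch of that sum follows. The function name `chi_square_statistic` is hypothetical, and the 2×2 counts in the usage example are made up purely to show the arithmetic. They are not the study's data.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells of two equally shaped tables."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total


# Made-up toy table (NOT the attrition data), used only for illustration.
toy_observed = [[10, 20], [30, 40]]
# Expected counts for the toy table: row total * column total / grand total.
toy_expected = [[12.0, 18.0], [28.0, 42.0]]

toy_stat = chi_square_statistic(toy_observed, toy_expected)
print(round(toy_stat, 4))
```

A statistic near zero means the observed counts sit close to what independence predicts. Large values signal a discrepancy.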
The p-value, i.e. the area to the right of 23.42 under the chi-square curve with (3-1)(4-1) = 6 degrees of freedom, is 0.0007.
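The tail area can be checked with the standard library alone. When the degrees of freedom are even, the chi-square upper tail has a closed form: exp(−x/2) times the sum of (x/2)^k / k! for k from 0 to df/2 − 1. The sketch below uses that identity. The function name `chi2_upper_tail_even_df` is hypothetical, and it deliberately rejects odd degrees of freedom, for which this formula does not apply.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """Area to the right of x under the chi-square curve, for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    series = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * series


df = (3 - 1) * (4 - 1)  # (rows - 1) * (columns - 1) = 6
p_value = chi2_upper_tail_even_df(23.42, df)
print(round(p_value, 4))  # 0.0007
```

For arbitrary degrees of freedom, a statistics library such as SciPy (`scipy.stats.chi2.sf`) is the more general tool.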
