Next: Calculating the Table of Up: Testing Equality of Frequencies Previous: Testing Equality of Frequencies

   
Testing for Independence between Two Variables Using an $r \times c$ Table

Is there an association between gender and height? Yes, males tend to be taller than females. A more formal way of saying this is `the height distribution for males tends to be different from that for females'. Is there an association between shoe size and height? Yes, `the height distribution for men who wear size 12 is different from that for men who wear size 8'. Is there an association between GPA and height? No, `the height distribution tends to be the same for 3.0 students as for 3.5 students'.

Two variables A and B are said to be associated if the distribution of B tends to change with the level of A. Otherwise, they are said to be independent.
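The definition can also be written in symbols. The notation below is an assumption added for illustration and does not come from the text: $a$ stands for a level of A and $b$ for a value of B. Under independence, conditioning on the level of A leaves the distribution of B unchanged:

$$P(B = b \mid A = a) = P(B = b) \quad \text{for every level } a \text{ and every value } b.$$

A and B are associated when this equality fails for at least one pair $(a, b)$.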

Therefore height is associated with gender and shoe size, but independent of GPA.

Now consider the following $3\times 4$ table. A total of 189 students entering a business school program were followed as part of an attrition (i.e. drop out, transfer) study. The students were cross-classified according to 4 categories of high school GPA (2.0-2.5, 2.5-3.0, 3.0-3.5, 3.5-4.0) and 3 categories of attrition outcome (`did not return for 2nd year', `returned for 2nd but not for 3rd year', `returned for 3rd year'). Is there an association between HS GPA and college attrition?


 
Table 9.1: Retention versus HS GPA
 



                      2.0-2.5  2.5-3.0  3.0-3.5  3.5-4.0
No 2nd Year                25        3        4        6
No 3rd Year                14        7        4        6
Return for 3rd Year        41        7       28       44

To analyze whether attrition and GPA are independent, we check whether the distribution of attrition outcomes stays the same regardless of GPA level. Here is how a statistical report would present the analysis, in numbered stages.

1. Hypotheses

H0: The two variables are independent versus H1: The two variables are associated.

2. Test Statistic

If the two variables are independent, the observed frequencies should be close to the following expected frequencies (calculations later):



                      2.0-2.5  2.5-3.0  3.0-3.5  3.5-4.0
No 2nd Year             16.08     3.42     7.24    11.26
No 3rd Year             13.12     2.79     5.90     9.18
Return for 3rd Year     50.80    10.80    22.86    35.56
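As a rough preview of the calculation deferred to later, the sketch below applies the standard rule for expected counts under independence: row total times column total, divided by the grand total. The names `observed` and `expected_counts` are illustrative choices, not taken from the text.

```python
# Observed counts from Table 9.1
# (rows: attrition outcome, columns: HS GPA band).
observed = [
    [25, 3, 4, 6],    # No 2nd Year
    [14, 7, 4, 6],    # No 3rd Year
    [41, 7, 28, 44],  # Return for 3rd Year
]


def expected_counts(table):
    """Expected count per cell under independence:
    (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print("  ".join(f"{value:6.2f}" for value in row))
```

The printed values should agree with the table above to within rounding; the one visible difference is the cell for the third row and second column, where 120 × 17 / 189 is about 10.79 and the table shows 10.80.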

The chi-square test statistic measures how far the observed frequencies are from the expected frequencies:

$\chi^2 = \frac{(25-16.08)^2}{16.08} + \frac{(3-3.42)^2}{3.42} + \frac{(4-7.24)^2}{7.24} + \frac{(6-11.26)^2}{11.26}$
    $+ \frac{(14-13.12)^2}{13.12} + \frac{(7-2.79)^2}{2.79} + \frac{(4-5.90)^2}{5.90} + \frac{(6-9.18)^2}{9.18}$
    $+ \frac{(41-50.80)^2}{50.80} + \frac{(7-10.80)^2}{10.80} + \frac{(28-22.86)^2}{22.86} + \frac{(44-35.56)^2}{35.56}$
    $= 23.42$
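The sum above can be checked with a short script. This is a minimal sketch: it uses unrounded expected counts (row total times column total over grand total) rather than the two-decimal values in the formula, so the result may differ slightly in the later decimals. The function name `chi_square_statistic` is an illustrative choice.

```python
# Observed counts from Table 9.1
# (rows: attrition outcome, columns: HS GPA band).
observed = [
    [25, 3, 4, 6],    # No 2nd Year
    [14, 7, 4, 6],    # No 3rd Year
    [41, 7, 28, 44],  # Return for 3rd Year
]


def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count under independence (unrounded).
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic


chi_square = chi_square_statistic(observed)
print(f"chi-square = {chi_square:.2f}")
```

The value should come out close to the 23.42 reported above. Cells whose observed count is far from the expected count relative to its size, such as 25 versus 16.08 and 7 versus 2.79, contribute the largest terms.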

3. P-value

The area to the right of 23.42 under the chi-square curve with (3-1)(4-1) = 6 degrees of freedom is 0.0007.

[Figure: chi-square curve with 6 degrees of freedom, showing the area to the right of 23.42]
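The tail area can be reproduced without statistical tables. The sketch below relies on a closed form that holds only when the degrees of freedom are even: the upper-tail area at x is exp(-x/2) times the sum of (x/2)^j / j! for j from 0 to df/2 - 1. The function name `chi_square_upper_tail` is an illustrative choice, and for odd degrees of freedom a statistics library would be needed instead.

```python
import math


def chi_square_upper_tail(x, df):
    """Area to the right of x under the chi-square curve.

    Uses the closed form that is valid for even df only.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^j / j! for j = 0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


rows, cols = 3, 4
df = (rows - 1) * (cols - 1)
p_value = chi_square_upper_tail(23.42, df)
print(f"df = {df}, p-value = {p_value:.4f}")
```

With 6 degrees of freedom the sum has three terms, 1 + x/2 + (x/2)^2 / 2, and the result rounds to the 0.0007 quoted above.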

4. Conclusion

Since the P-value is less than .05, we reject H0 and conclude that there is very strong evidence that GPA and attrition are associated. In particular, low HS GPAs are associated with higher attrition rates.
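To see the direction of the association, it may help to compare the distribution of attrition outcomes within each GPA band, which is the comparison the definition of association calls for. The sketch below converts each column of Table 9.1 to percentages; the names `column_percentages`, `outcomes`, and `gpa_bands` are illustrative choices.

```python
outcomes = ["No 2nd Year", "No 3rd Year", "Return for 3rd Year"]
gpa_bands = ["2.0-2.5", "2.5-3.0", "3.0-3.5", "3.5-4.0"]

# Observed counts from Table 9.1 (rows: outcome, columns: HS GPA band).
observed = [
    [25, 3, 4, 6],
    [14, 7, 4, 6],
    [41, 7, 28, 44],
]


def column_percentages(table):
    """Percentage of each column's total that falls in each row."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [100.0 * count / total for count, total in zip(row, col_totals)]
        for row in table
    ]


percentages = column_percentages(observed)
print(" " * 20 + "".join(f"{band:>9}" for band in gpa_bands))
for label, row in zip(outcomes, percentages):
    print(f"{label:<20}" + "".join(f"{value:8.1f}%" for value in row))
```

In the lowest GPA band, 25 of 80 students (about 31%) did not return for a second year, compared with 6 of 56 (about 11%) in the highest band. If the two variables were independent, these column distributions would be roughly the same.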



 

2003-09-08