Next: Test of Independence for Up: Goodness-of-Fit Tests Previous: Goodness-of-Fit Tests

   
Chi-Squared Goodness of Fit Test

Recall that a discrete probability model consists of the range of possible values and the corresponding probabilities of those values. In practice we often have a hypothesis of interest concerning these probabilities.

For example, suppose we have a 6-sided die. The hypothesis of interest is that the die is fair; i.e., the corresponding probabilities are all 1/6. If we let p1 be the probability that the upface is a 1, p2 be the probability that the upface is a 2, et cetera, our hypotheses are:

\begin{displaymath}\mbox{$H_0:\;$ all the $p$'s are $\frac{1}{6}$ versus $H_A:\;$ at least one of the $p$'s is not $\frac{1}{6}$} \;.
\end{displaymath}

To test this hypothesis, we roll the die many times and obtain a sample frequency of outcomes; i.e., the observed number of 1's (O1), the observed number of 2's (O2), et cetera. We then compare these with what we expect under H0: the expected number of 1's (E1), the expected number of 2's (E2), et cetera. The sum of the squared (standardized) deviations from what we expect is our test statistic:

 \begin{displaymath}
\chi^2 = \mbox{Sum}\left\{\frac{(O - E)^2}{E} \right\} \;.
\end{displaymath} (1)
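
As a minimal sketch of formula (1), the statistic can be computed directly from the observed and expected counts. The function name `chi_squared_statistic` and the three-category counts below are hypothetical, chosen only for illustration; they are not part of these notes.

```python
def chi_squared_statistic(observed, expected):
    """Return the sum over categories of (O - E)^2 / E, as in formula (1)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 60 trials, three equally likely categories under H0.
observed = [18, 22, 20]
expected = [20, 20, 20]
print(chi_squared_statistic(observed, expected))
```

Each term is divided by its own expected count, so a deviation of 2 matters more when only 20 observations are expected than when 200 are.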

We will employ a simple decision rule.

 \begin{displaymath}
\mbox{Reject $H_0$ in favor of $H_A$ if $\chi^2 \ge 1.645\sqrt{2(k-1)} + k-1$ }\;,
\end{displaymath} (2)

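where k is the number of categories. The critical value is a normal approximation to the upper 5% point of the $\chi^2$ distribution with k-1 degrees of freedom, which has mean k-1 and variance 2(k-1); 1.645 is the upper 5% point of the standard normal distribution.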
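
Decision rule (2) can be sketched in a few lines. The helper names `critical_value` and `reject_h0` are assumptions made for this illustration, and the value of the statistic passed in at the end is hypothetical.

```python
import math


def critical_value(k, z=1.645):
    """Critical value from rule (2): z * sqrt(2(k-1)) + (k-1) for k categories."""
    df = k - 1
    return z * math.sqrt(2 * df) + df


def reject_h0(chi2, k):
    """Return True when the statistic is at least the critical value."""
    return chi2 >= critical_value(k)


# Hypothetical statistic of 9.3 from a model with k = 4 categories.
print(critical_value(4))
print(reject_h0(9.3, 4))
```

Keeping the constant 1.645 as a default argument makes it easy to see where the rule would change if a different cutoff were wanted.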

Example 0.0.1   Consider the fair 6-sided die discussed above. Suppose 600 rolls result in the following observed frequencies:

Upface                   1     2     3     4     5     6   Total
Observed Frequency O    92   108    78    97   124   101     600

Do you think the die is fair based on these data? Just look at those 78 3's, 22 below what we expect. And how about those 124 5's, 24 above? Stand back, let $\chi^2$ get to work. If the die is fair (i.e., if H0 is true), we expect 100 of each upface. So

\begin{displaymath}\chi^2 = \frac{(92-100)^2}{100}+
\frac{(108-100)^2}{100}+ \frac{(78-100)^2}{100}+
\frac{(97-100)^2}{100}+
\frac{(124-100)^2}{100}+\frac{(101-100)^2}{100} = 11.980 \;.
\end{displaymath}

Since k = 6 our critical value is:

\begin{displaymath}1.645\sqrt{2(k-1)} + k-1 = 1.645\sqrt{2(6-1)} + 6-1 = 10.2019 \;.
\end{displaymath}

Because $\chi^2 = 11.980 \geq 10.2019$, we reject H0 in favor of HA, concluding that the die is not fair.
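
To see which upfaces drive this conclusion, the calculation can be repeated with the contribution of each category listed separately. This is only a sketch using the counts from the example; the variable names are chosen for illustration.

```python
import math

# Observed counts for upfaces 1 through 6, taken from the example above.
observed = [92, 108, 78, 97, 124, 101]
n = sum(observed)          # total number of rolls
k = len(observed)          # number of categories
expected = [n / k] * k     # equal expected counts under H0 (fair die)

# Contribution of each upface to the statistic: (O - E)^2 / E.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(contributions)
critical = 1.645 * math.sqrt(2 * (k - 1)) + (k - 1)

for face, c in enumerate(contributions, start=1):
    print(f"upface {face}: contribution {c:.2f}")
print(f"chi-squared = {chi2:.3f}, critical value = {critical:.4f}")
print("reject H0" if chi2 >= critical else "do not reject H0")
```

The 3's and 5's account for nearly all of the statistic, which matches the impression from looking at the table.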

The hypothesis H0 is often called an overall (omnibus) hypothesis. Upon rejecting it, we often consider separate confidence intervals for the categorical probabilities. Recall the formula for the 95% confidence interval for a proportion:

\begin{displaymath}\widehat{p} \pm 1.96 \sqrt{\frac{\widehat{p}(1-\widehat{p})}{n}} \;.
\end{displaymath}

For instance, with $\widehat{p}_1 = 92/600 = .153$, the confidence interval for p1 is:

\begin{displaymath}.153 \pm 1.96 \sqrt{\frac{.153(1-.153)}{600}} \;.
\end{displaymath}

This is (.1245, .1822) when computed with the unrounded $\widehat{p}_1$, which traps 1/6 = .1667. In the exercises, you will be asked to obtain the confidence intervals for some of the remaining proportions.
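
The interval for p1 can be checked with a short sketch of the confidence interval formula. The function name `proportion_ci` is an assumption made for this illustration; the count 92 and the sample size 600 come from the example.

```python
import math


def proportion_ci(count, n, z=1.96):
    """Return (lower, upper) for p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = count / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Interval for p1: 92 ones observed in 600 rolls.
low, high = proportion_ci(92, 600)
print(f"({low:.4f}, {high:.4f})")
print("traps 1/6" if low <= 1 / 6 <= high else "does not trap 1/6")
```

The sketch works with the unrounded sample proportion, so its endpoints can differ in the last digit from a hand calculation that starts from .153.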




EXERCISES

0.0.1   Obtain the confidence intervals for the proportions p2 and p3 in the above example. Do the intervals trap 1/6?


Stat 160
2002-04-12