Is there "grade inflation" at WMU? How does the average GPA of WMU students today compare with, say, 10 years ago? Suppose a random sample of 100 student records from 10 years ago yields a sample average GPA of 2.90 with a standard deviation of .40. A random sample of 100 current students yields a sample average of 2.98 with a standard deviation of .45. The difference between the two sample means is 2.98 - 2.90 = .08. Is this proof that GPAs are higher today than 10 years ago?

Well, first we need to account for the fact that 2.98 and 2.90 are not the true averages, but are computed from random samples. Therefore, .08 is not the true difference, but simply an estimate of the true difference. Can this estimate miss by much? Fortunately, statistics has a way of measuring the expected size of the "miss" (or error of estimation). For our example, it is .06 (we show how to calculate this later). Therefore, we can state the bottom line of the study as follows: "The average GPA of WMU students today is .08 higher than 10 years ago, give or take .06 or so."
We now show how to calculate the .06, the standard error of the
estimate. But first, a note on terminology. The estimate .08=2.98-2.90 is
a difference between averages (or means) of two independent random samples.
"Independent" refers to the sampling luck-of-the-draw:
the luck of the second sample is unaffected by the first sample.
In other words, there were two independent chances to have gotten lucky
or unlucky with the sampling.
The likely size of the error of estimation in the .08 is called
the standard error of the difference between independent means.
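We calculate it using the following formula:

$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \qquad (7.6)$$

Note that $S_1/\sqrt{n_1}$ and $S_2/\sqrt{n_2}$ are the SE's of $\bar{x}_1$ and $\bar{x}_2$, respectively.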
The formula looks easier without the notation and the subscripts.
2.98 is a sample mean, and has standard error $.45/\sqrt{100} = .045$
(since SE $= S/\sqrt{n}$). Similarly, 2.90 is a sample mean and has standard error
$.40/\sqrt{100} = .040$.
Summarizing, we write the two mean estimates
(and their SE's in parentheses) as

2.98 (SE=.045)

2.90 (SE=.040)

If two independent estimates are subtracted, the formula (7.6) shows how to compute the SE of the difference:

2.98 - 2.90 = .08 (SE = $\sqrt{.045^2 + .040^2} \approx .06$)
Remember the Pythagorean Theorem in geometry?
Think of the two SE's as the lengths of the two shorter sides
of a right triangle (call them a and b). The SE of the difference
then equals the length of the hypotenuse (SE of difference =
$\sqrt{a^2 + b^2}$).
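The arithmetic above can be checked with a short Python sketch. The function names `se_of_mean` and `se_of_difference` are illustrative choices, not part of the original text; the numbers are the ones from the grade inflation example.

```python
import math

def se_of_mean(sd, n):
    """Standard error of a sample mean: SD divided by sqrt(n)."""
    return sd / math.sqrt(n)

def se_of_difference(se_a, se_b):
    """SE of the difference of two independent estimates.

    This is the 'hypotenuse' rule: sqrt(se_a**2 + se_b**2).
    """
    return math.hypot(se_a, se_b)

# Grade inflation example: SD .45 (today) and .40 (10 years ago), n = 100 each.
se_today = se_of_mean(0.45, 100)
se_past = se_of_mean(0.40, 100)
se_diff = se_of_difference(se_today, se_past)

print(round(se_diff, 2))  # 0.06
```

Using `math.hypot` makes the link to the Pythagorean picture explicit, but `math.sqrt(se_a**2 + se_b**2)` would give the same result.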
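We are now ready to state a confidence interval for the difference between two independent means.
The correct z critical value for a 95% confidence interval is z=1.96. Therefore
a 95% z-confidence interval for the difference between the population means, $\mu_1 - \mu_2$,
is

$$(\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \qquad (7.7)$$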
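As a rough illustration, the z-confidence interval (estimate plus or minus z times the SE of the difference) can be computed for the grade inflation example as follows. The function name `z_interval` and its parameters are assumptions made for this sketch; the critical value comes from the standard library's `statistics.NormalDist`, which returns approximately 1.96 for 95% confidence.

```python
import math
from statistics import NormalDist

def z_interval(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """z-confidence interval for the difference between two independent means."""
    diff = mean1 - mean2
    # SE of the difference between independent means
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff - z * se, diff + z * se

# Sample 1: current students; sample 2: students from 10 years ago.
low, high = z_interval(2.98, 0.45, 100, 2.90, 0.40, 100)
print(f"{low:.3f} to {high:.3f}")  # -0.038 to 0.198
```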
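There is a second procedure that is preferable when either n1 or n2 or both are small. However, this method needs additional requirements to be satisfied (at least approximately):

Requirement R1: Both samples follow a normal-shaped histogram.

Requirement R2: The population SD's $\sigma_1$ and $\sigma_2$ are equal.

Let Sp denote a "pooled" estimate of the common SD, as follows:

$$S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}}$$

The t-confidence interval with pooled SD is then

$$(\bar{x}_1 - \bar{x}_2) \pm t\sqrt{\frac{S_p^2}{n_1} + \frac{S_p^2}{n_2}} \qquad (7.8)$$

where t is the critical value from the t distribution with $n_1 + n_2 - 2$ degrees of freedom.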
Returning to the grade inflation example, the pooled SD is

$$S_p = \sqrt{\frac{99(.45)^2 + 99(.40)^2}{198}} = \sqrt{.18125} \approx .426$$
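A minimal sketch of the pooled calculation, assuming the usual pooled-SD formula (each sample variance weighted by its sample size minus one) and a t interval with n1 + n2 - 2 degrees of freedom. The function names are illustrative. The t critical value is passed in by the caller; the value 1.972 used below is an approximate table value for 198 degrees of freedom at 95% confidence, and should be treated as an assumption of this example.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled estimate of the common SD of two samples."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)

def pooled_t_interval(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """t-confidence interval for the difference, using the pooled SD.

    t_crit must be looked up for n1 + n2 - 2 degrees of freedom.
    """
    sp = pooled_sd(sd1, n1, sd2, n2)
    se = math.sqrt(sp**2 / n1 + sp**2 / n2)
    diff = mean1 - mean2
    return diff - t_crit * se, diff + t_crit * se

sp = pooled_sd(0.45, 100, 0.40, 100)
print(round(sp, 3))  # 0.426

# Assumed table value: t is roughly 1.972 for 198 degrees of freedom (95%).
t_low, t_high = pooled_t_interval(2.98, 0.45, 100, 2.90, 0.40, 100, t_crit=1.972)
print(f"{t_low:.3f} to {t_high:.3f}")
```

With samples this large, the t critical value is very close to 1.96, so the two intervals nearly coincide.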
Note that the t-confidence interval (7.8) with pooled SD looks like the z-confidence interval (7.7), except that S1 and S2 are replaced by Sp, and z is replaced by t. We present a summary of the situations under which each method is recommended.
| | R1 and R2 are both satisfied | R1 or R2 or both not satisfied |
| Both samples are large | Use z or t | Use z |
| One or both samples small | Use t | Consult a statistician |
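The summary table can also be written as a small lookup function. This is only a sketch: `recommend_method` is a hypothetical helper, and because the text does not give a numeric cutoff for a "large" sample, the caller must decide that and pass it in as a boolean.

```python
def recommend_method(both_large, r1_ok, r2_ok):
    """Return the recommended procedure from the summary table.

    both_large: True if both samples are large (caller's judgment).
    r1_ok: True if both samples follow a normal-shaped histogram (R1).
    r2_ok: True if the population SD's are equal (R2).
    """
    requirements_met = r1_ok and r2_ok
    if both_large:
        return "z or t" if requirements_met else "z"
    return "t" if requirements_met else "consult a statistician"

# One small sample, with R1 and R2 both satisfied:
print(recommend_method(both_large=False, r1_ok=True, r2_ok=True))  # t
```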