
Comparing the Averages of Two Independent Samples

Is there "grade inflation" at WMU? How does the average GPA of WMU students today compare with that of, say, 10 years ago? Suppose a random sample of 100 student records from 10 years ago yields a sample average GPA of 2.90 with a standard deviation of .40. A random sample of 100 current students yields a sample average of 2.98 with a standard deviation of .45. The difference between the two sample means is 2.98 - 2.90 = .08. Is this proof that GPAs are higher today than 10 years ago? Well, first we need to account for the fact that 2.98 and 2.90 are not the true averages, but are computed from random samples. Therefore, .08 is not the true difference, but simply an estimate of the true difference. Can this estimate miss by much? Fortunately, statistics has a way of measuring the expected size of the "miss" (or error of estimation). For our example, it is .06 (we show how to calculate this later). Therefore, we can state the bottom line of the study as follows: "The average GPA of WMU students today is .08 higher than 10 years ago, give or take .06 or so."

We now show how to calculate the .06, the standard error of the estimate. But first, a note on terminology. The estimate .08=2.98-2.90 is a difference between averages (or means) of two independent random samples. "Independent" refers to the sampling luck-of-the-draw: the luck of the second sample is unaffected by the first sample. In other words, there were two independent chances to have gotten lucky or unlucky with the sampling. The likely size of the error of estimation in the .08 is called the standard error of the difference between independent means. We calculate it using the following formula:

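$$\mathrm{SE}(\bar{X}_1 - \bar{X}_2) = \sqrt{\mathrm{SE}_1^2 + \mathrm{SE}_2^2} \qquad (7.4)$$

where $\mathrm{SE}_1 = S_1/\sqrt{n_1}$ and $\mathrm{SE}_2 = S_2/\sqrt{n_2}$.

Note that $\mathrm{SE}_1$ and $\mathrm{SE}_2$ are the SE's of $\bar{X}_1$ and $\bar{X}_2$, respectively. The formula looks easier without the notation and the subscripts. 2.98 is a sample mean, and has standard error $.45/\sqrt{100} = .045$ (since SE $= S/\sqrt{n}$). Similarly, 2.90 is a sample mean and has standard error $.40/\sqrt{100} = .040$. Summarizing, we write the two mean estimates (and their SE's in parentheses) as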

2.98 (SE=.045)
2.90 (SE=.040)
If two independent estimates are subtracted, the formula (7.6) shows how to compute the SE of the difference:
2.98 - 2.90 (SE = $\sqrt{.045^2 + .040^2} = .06$)
or .08 ± .06.

Remember the Pythagorean Theorem in geometry? Think of the two SE's as the lengths of the two shorter sides of a right triangle (call them a and b). The SE of the difference then equals the length of the hypotenuse (SE of difference = $\sqrt{a^2 + b^2}$).
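The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names `se_of_mean` and `se_of_difference` are made up for illustration, and `math.hypot` is used because it computes exactly the hypotenuse length described in the Pythagorean analogy.

```python
import math

def se_of_mean(sd, n):
    """Standard error of a sample mean: the sample SD divided by sqrt(n)."""
    return sd / math.sqrt(n)

def se_of_difference(se1, se2):
    """SE of the difference between two independent estimates.

    The two SE's play the role of the sides of a right triangle;
    the result is the length of the hypotenuse, sqrt(se1^2 + se2^2).
    """
    return math.hypot(se1, se2)

# Grade inflation example from the text
se_today = se_of_mean(0.45, 100)   # SE of the 2.98 average
se_past = se_of_mean(0.40, 100)    # SE of the 2.90 average
se_diff = se_of_difference(se_today, se_past)

print(round(se_today, 3), round(se_past, 3), round(se_diff, 2))
```

Rounded to two decimals, `se_diff` reproduces the .06 quoted in the example.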

We are now ready to state a confidence interval for the difference between two independent means.

$$\bar{X}_1 - \bar{X}_2 \pm z\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \qquad (7.7)$$

The correct z critical value for a 95% confidence interval is z=1.96. Therefore a 95% z-confidence interval for $\mu_1 - \mu_2$ is

$$.08 \pm 1.96(.06)$$

or (-.04, .20).
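As a rough check of the z-interval, the sketch below recomputes it from the summary statistics. The function name `z_interval` and its argument order are assumptions made for this illustration; the critical value is taken from the standard library's `statistics.NormalDist` rather than from a table.

```python
import math
from statistics import NormalDist

def z_interval(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """z-confidence interval for the difference between two independent means."""
    estimate = mean1 - mean2
    # SE of the difference: square root of the sum of the squared SE's
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return estimate - z * se, estimate + z * se

# Grade inflation example: today's sample first, the older sample second
low, high = z_interval(2.98, 0.45, 100, 2.90, 0.40, 100)
print(f"({low:.2f}, {high:.2f})")
```

Because the interval contains 0, the data are consistent with no change in the true average GPA.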

There is a second procedure that is preferable when either n1 or n2 or both are small. However, this method needs additional requirements to be satisfied (at least approximately):

Requirement R1: Both samples follow a normal-shaped histogram
Requirement R2: The population SD's $\sigma_1$ and $\sigma_2$ are equal.

Let $S_p$ denote a "pooled" estimate of the common SD, as follows:

$$S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}}$$

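The following confidence interval is called a "Pooled SD" or "Pooled Variance" confidence interval.

$$\bar{X}_1 - \bar{X}_2 \pm t\sqrt{\frac{S_p^2}{n_1} + \frac{S_p^2}{n_2}} \qquad (7.8)$$

where t is the critical value from the t curve with $n_1 + n_2 - 2$ degrees of freedom.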

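Returning to the grade inflation example, the pooled SD is

$$S_p = \sqrt{\frac{99(.45)^2 + 99(.40)^2}{198}} = .426$$

Therefore, $S_p/\sqrt{n_1} = .0426$, $S_p/\sqrt{n_2} = .0426$, and the difference between means is estimated as

$$2.98 - 2.90 \pm \sqrt{.0426^2 + .0426^2} = .08 \pm .06$$

where the second term is the standard error. For a 95% confidence interval, the appropriate value from the t curve with 198 degrees of freedom is about 1.97, very close to the z value of 1.96 because the degrees of freedom are large. Therefore a t-confidence interval for $\mu_1 - \mu_2$ with confidence level .95 is

$$.08 \pm 1.97(.06)$$

or (-.04, .20).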
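The pooled calculation can be sketched the same way. The names `pooled_sd` and `pooled_t_interval` are hypothetical, and the t critical value is passed in by hand (as if read from a t table) because the Python standard library has no t distribution; the value 1.97 used below is an assumed table lookup for 198 degrees of freedom, and 1.96 gives the same interval to two decimals.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled estimate of the common SD, weighting each variance by n - 1."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def pooled_t_interval(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Pooled-SD t-confidence interval for the difference between two means.

    t_crit must be looked up by the caller for n1 + n2 - 2 degrees of freedom.
    """
    sp = pooled_sd(sd1, n1, sd2, n2)
    # Same SE formula as before, with both sample SD's replaced by the pooled SD
    se = math.sqrt(sp**2 / n1 + sp**2 / n2)
    estimate = mean1 - mean2
    return estimate - t_crit * se, estimate + t_crit * se

# Grade inflation example; t_crit=1.97 is an assumed table value for 198 df
sp = pooled_sd(0.45, 100, 0.40, 100)
low, high = pooled_t_interval(2.98, 0.45, 100, 2.90, 0.40, 100, t_crit=1.97)
print(round(sp, 3), f"({low:.2f}, {high:.2f})")
```

With equal sample sizes, as here, the pooled SE coincides with the unpooled one, which is why the t and z intervals agree.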

Note that the t-confidence interval (7.8) with pooled SD looks like the z-confidence interval (7.7), except that $S_1$ and $S_2$ are replaced by $S_p$, and z is replaced by t. We present a summary of the situations under which each method is recommended.

| | R1 and R2 are both satisfied | R1 or R2 or both not satisfied |
|---|---|---|
| Both samples are large | Use z or t | Use z |
| One or both samples small | Use t | Consult a statistician |
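The summary can also be written as a small lookup, shown here only as an illustration. The function name `recommended_method` and its two boolean arguments are assumptions; the text does not say how large a sample must be to count as "large", so that judgment is left to the caller.

```python
def recommended_method(both_samples_large, r1_and_r2_satisfied):
    """Return the recommendation from the summary table.

    both_samples_large: True if both samples are large (caller's judgment).
    r1_and_r2_satisfied: True only if requirements R1 and R2 both hold.
    """
    if both_samples_large:
        return "z or t" if r1_and_r2_satisfied else "z"
    return "t" if r1_and_r2_satisfied else "consult a statistician"

# Grade inflation example: two samples of 100, treated here as large
print(recommended_method(both_samples_large=True, r1_and_r2_satisfied=False))
```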


2003-09-08