This chapter continues the previous one. As before, we have two populations, X and Y, but we now want to estimate the difference between them. We do need to make one important assumption: the two populations have the same shape and scale and differ, if at all, only by a shift in location.
Fortunately, we have at least visual checks for this assumption: comparison
dotplots, comparison boxplots, and back-to-back stem-and-leaf plots of the
samples we obtain. For example, if the lengths of the boxes in the comparison
boxplots differ greatly, this indicates that the scale (or noise level)
also differs between the populations. Likewise, provided the sample
sizes are large enough, if the shapes of the two sides of a back-to-back
stem-and-leaf plot are quite different, this indicates that the populations
differ by more than a shift in location.
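The box-length comparison can also be done numerically as a rough companion to the plots. The sketch below is only illustrative: the two samples are made up, the function name `box_length` is hypothetical, and the quartile rule used (Python's "inclusive" method) may differ slightly from the hinges a particular boxplot routine draws.

```python
from statistics import quantiles


def box_length(sample):
    """Interquartile range: the length of the box in a boxplot."""
    q1, _, q3 = quantiles(sample, n=4, method="inclusive")
    return q3 - q1


# Hypothetical samples, invented purely for illustration.
x = [12.1, 13.4, 14.0, 15.2, 15.9, 16.3, 17.8, 18.4]
y = [15.0, 16.2, 17.1, 18.3, 18.8, 19.5, 20.9, 21.6]

ratio = box_length(y) / box_length(x)
print(f"box length of x: {box_length(x):.2f}")
print(f"box length of y: {box_length(y):.2f}")
print(f"ratio (y / x):   {ratio:.2f}")
```

A ratio far from 1 suggests that the scales differ, so the populations may differ by more than a shift. How far is too far is a judgment call, just as it is when reading the boxplots by eye.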
Under this assumption, the problem is easily parameterized. Let $\Delta$
be the difference in locations of the populations. In many problems,
we think of $\Delta$ as the effect between the populations.
If $\mu_X$ is the mean of the first population and $\mu_Y$ is the mean
of the second population then $\Delta = \mu_Y - \mu_X$.
But $\Delta$ is also the difference in population medians; shift is shift.
Hence, if $\theta_X$ is the median of the first population and $\theta_Y$
is the median of the second population then $\Delta = \theta_Y - \theta_X$.
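Why the same shift appears in both the means and the medians can be written out briefly. The sketch below assumes the shift model is expressed as "Y has the same distribution as X plus a constant", that the populations are continuous, and that the means exist. The symbols $\Delta$, $\mu$ and $\theta$ and the second-minus-first sign convention are notational assumptions.

```latex
\[
Y \overset{d}{=} X + \Delta
\quad\Longrightarrow\quad
\mu_Y = E[X + \Delta] = \mu_X + \Delta ,
\]
\[
P(Y \le \theta_X + \Delta) = P(X + \Delta \le \theta_X + \Delta)
  = P(X \le \theta_X) = \tfrac{1}{2}
\quad\Longrightarrow\quad
\theta_Y = \theta_X + \Delta .
\]
```

Either way, the difference in means and the difference in medians equal the same shift.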
So we want to estimate $\Delta$
and we will be done. What's that? Louder, I can't hear you. Right! We must
also estimate the error of estimation. We want a confidence interval for
$\Delta$, too. How much did our estimate of $\Delta$ miss $\Delta$ by?
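One way to get both an estimate of the shift and a confidence interval for it is sketched below. It uses a rank-based approach: the median of all pairwise differences (the Hodges-Lehmann estimate), with interval endpoints chosen from the ordered differences by a large-sample normal approximation. This is an illustrative choice, not necessarily the method this chapter develops. The function name, the made-up data, and the sign convention (second sample minus first) are all assumptions.

```python
from math import floor, sqrt
from statistics import NormalDist, median


def shift_estimate_and_ci(x, y, conf=0.95):
    """Estimate the shift (location of y minus location of x).

    Returns the median of all pairwise differences y_j - x_i together with
    an approximate confidence interval taken from the ordered differences.
    """
    n1, n2 = len(x), len(y)
    diffs = sorted(yj - xi for xi in x for yj in y)
    estimate = median(diffs)

    # Large-sample cut point: the interval runs from the (c+1)-th smallest
    # to the (n1*n2 - c)-th smallest pairwise difference (1-based counting).
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    c = floor(n1 * n2 / 2 - 0.5 - z * sqrt(n1 * n2 * (n1 + n2 + 1) / 12))
    c = max(c, 0)  # guard for very small samples

    lower = diffs[c]                # (c+1)-th smallest, 0-based index c
    upper = diffs[n1 * n2 - c - 1]  # (n1*n2 - c)-th smallest
    return estimate, (lower, upper)


# Hypothetical samples, invented purely for illustration.
x = [12.1, 13.4, 14.0, 15.2, 15.9, 16.3, 17.8, 18.4]
y = [15.0, 16.2, 17.1, 18.3, 18.8, 19.5, 20.9, 21.6]

estimate, (lower, upper) = shift_estimate_and_ci(x, y)
print(f"estimated shift: {estimate:.2f}")
print(f"approximate 95% interval: ({lower:.2f}, {upper:.2f})")
```

The width of the interval is the answer to "how much did we miss by": a narrow interval means the estimate is pinned down well, a wide one means it is not. The normal approximation is rough for small samples, so the interval here should be read as a sketch only.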
One final word. The value to check for in the confidence interval is 0. If 0 is in the confidence interval, then there may be no difference between the populations. Note that this is another way of testing for a difference between the populations. In particular, consider the hypotheses: