
Hypothesis Testing

This page will contain examples of the following:
1. Z-test for the mean
2. Z-test for the proportion
3. t-test for the mean (with summary statistics given)
4. t-test for the mean (with "raw" data given)

1.
Consider Exercise 8.49, on page 506 of your textbook.
The policy of a particular bank branch is that its ATMs must be stocked with enough cash to satisfy customers making withdrawals over an entire weekend. At this branch the expected (i.e., population) average amount of money withdrawn from ATM machines per customer transaction over the weekend is $160 with an expected (i.e., population) standard deviation of $30. Suppose that a random sample of 36 customer transactions is examined and it is observed that the sample mean withdrawal is $172.
Note that the standard deviation that we are to use for this problem did not come from the sample. Therefore, this will be a Z-test, not a t-test. For this problem, we have $\sigma = 30$, n = 36, and $\overline{x} = 172$.
(a)
State the null and alternative hypothesis.
The exercise does not state the claim explicitly, but we are evidently to check whether the average withdrawal exceeds $160, which would mean the ATMs are not "stocked with enough cash". This is what we will try to show, so it goes in the alternative hypothesis. Therefore the hypotheses are:
$H_{0}: \mu \leq 160$
$H_{1}: \mu > 160$
(b)
At the .05 level of significance, using the critical value approach to hypothesis testing, is there evidence to believe that the true average withdrawal is greater than $160?
Let's get the test statistic:
$Z = \frac{\overline{X} - \mu}{\sigma / \sqrt{n}} =
\frac{172 - 160}{30 / \sqrt{36}} = 2.4$
Set up the rejection region by drawing a Z-curve and shading the rightmost 5% of the area, in the right tail. We need the Z critical value associated with this area, which is $Z_{\alpha} = Z_{.05} = 1.645$. Use the invNorm function with .95 as the argument, since, as your hand-drawn curve should clearly show, the area from $-\infty$ to $Z_{.05}$ is $1 - \alpha = 1 - .05 = .95$. The test statistic falls into the rejection region, i.e., 2.4 > 1.645, therefore we reject $H_{0}$. Yes, there is evidence that the average is more than $160.
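If you would like to double-check the calculator work on a computer, the critical value approach can be sketched in Python. This is only an illustration and not part of the textbook solution: it assumes Python 3.8 or later, where `statistics.NormalDist().inv_cdf` plays the role of invNorm, and the variable names are our own.

```python
from math import sqrt
from statistics import NormalDist

# Values from Exercise 8.49
mu_0 = 160      # hypothesized population mean
sigma = 30      # population standard deviation
n = 36          # sample size
x_bar = 172     # sample mean
alpha = 0.05    # level of significance

# Test statistic: Z = (x_bar - mu_0) / (sigma / sqrt(n))
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Right-tail critical value, the analogue of invNorm(.95)
z_crit = NormalDist().inv_cdf(1 - alpha)

# Reject H0 when the statistic lands in the right-tail rejection region
reject = z > z_crit
print(f"Z = {z:.4f}, critical value = {z_crit:.4f}, reject H0: {reject}")
```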
(c)
At the .05 level of significance, using the p-value approach to hypothesis testing, is there evidence to believe that the true average withdrawal is greater than $160?
Now you should draw another Z-curve, this time shading the area to the right of the test statistic, 2.4. Alternatively, you could use the same curve you drew in (b) and shade the new area with a different color pen. This would make it clear that the area we wish to obtain will be less than $\alpha = .05$. The shaded area is of course the p-value, and we can get it with the normalcdf function. You have done this before (in Chapter 6) when you needed a probability. Now the probability we need is p-value $= P(Z > 2.4) = .0081975$. This p-value is smaller than $\alpha$, thus we reject $H_{0}$.
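The same right-tail area can be checked in Python. This is a minimal sketch, assuming Python 3.8 or later; here `NormalDist().cdf` stands in for normalcdf, and the variable names are our own.

```python
from statistics import NormalDist

z = 2.4         # test statistic from part (b)
alpha = 0.05    # level of significance

# Area to the right of the test statistic under the standard normal curve
p_value = 1 - NormalDist().cdf(z)

print(f"p-value = {p_value:.7f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```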
(d)
Interpret the meaning of the p-value in this problem.
If $H_{0}$ is true (taking $\mu = 160$), the probability of obtaining a sample whose mean is $172 or more is .0082.
(e)
Compare your conclusions in (b) and (c).
The conclusions are the same, of course.
(f)
Now let's re-do parts (b) and (c) using the Z-Test function. Under STAT TESTS, choose Z-Test. We do not have a list of data for this problem; instead we have summary statistics from the textbook, so choose Stats. Enter the information for this problem: 160 for $\mu_{0}$, 30 for $\sigma$, 172 for $\overline{x}$, 36 for n, and $> \mu_{0}$ for the alternative.
With the cursor on Calculate, press ENTER.
This gives the same results as before. Note, however, that this method gives the p-value, and does not give the critical value.
Another useful option is Draw. Go back and choose Draw instead of Calculate. The graph should match what you drew by hand. Actually, for "extreme" test statistics, the shading will not show up on the graph. Imagine it as slightly off-screen to the right.

2.
Consider Exercise 8.71, on page 528 of your textbook.
The marketing manager for an automobile manufacturer is interested in determining the proportion of new compact-car owners who would have purchased a passenger-side inflatable air bag if it had been available for an additional cost of $300. The manager believes from previous information that the proportion is .30. Suppose that a survey of 200 new compact-car owners is selected and 79 indicate that they would have purchased the air bags.
Since this is a hypothesis test for the proportion, it will be a Z-test. For this problem, we have n = 200 and $p_{s} = 79/200 = .395$.
(a)
At the .10 level of significance, is there evidence that the population proportion is different from .30?
The hypotheses here will be
$H_{0}: p = .30$
$H_{1}: p \neq .30$
Let's get the test statistic:
$Z = \frac{p_{s} - p}{\sqrt{p(1-p) / n}} =
\frac{.395 - .30}{\sqrt{.30(.70) / 200}} = 2.93176$
Set up the rejection region by drawing a Z-curve and shading the most extreme 5% of each tail, so that the two shaded areas together make up $\alpha = .10$. We need the Z critical values associated with this, which are $\pm Z_{\alpha / 2} = \pm Z_{.10 / 2} = \pm Z_{.05} = \pm 1.645$. Again, use the invNorm function with .95 as the argument. The test statistic falls into the rejection region, i.e., 2.93176 > 1.645, therefore we reject $H_{0}$. Yes, there is evidence that the population proportion is different from .30.
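For a cross-check outside the calculator, the two-tailed critical value approach can be sketched in Python. This is an illustration only, assuming Python 3.8 or later; the variable names are our own.

```python
from math import sqrt
from statistics import NormalDist

# Values from Exercise 8.71
p_0 = 0.30        # hypothesized population proportion
n = 200           # sample size
successes = 79    # owners who would have purchased the air bag
alpha = 0.10      # level of significance

# Sample proportion and test statistic
p_s = successes / n
z = (p_s - p_0) / sqrt(p_0 * (1 - p_0) / n)

# Two-tailed test: alpha / 2 in each tail, so the cutoff is invNorm(.95)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Reject H0 when the statistic is beyond either critical value
reject = abs(z) > z_crit
print(f"Z = {z:.5f}, critical values = +/-{z_crit:.4f}, reject H0: {reject}")
```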
(b)
Compute the p-value and interpret its meaning.
Now you should draw another Z-curve, this time shading the area to the right (and left!) of the test statistic, $\pm 2.93176$. Of course, you could use the same curve you drew in (a) and shade the new area with a different color. This would make it easy to see that the area we wish to obtain will be less than $\alpha = .10$. The shaded area is the p-value, and we get it with the normalcdf function, like before. We have two identical shaded areas, so we calculate:
p-value $= 2P(Z > 2.93176) = 2(.001685) = .00337$. This p-value is smaller than $\alpha$, thus we reject $H_{0}$.
(c)
What is your answer to (a) if 70 new owners indicated that they would have purchased the air bags?
Now we have $p_{s} = 70/200 = .35$. The test statistic changes to:
$Z = \frac{.35 - .30}{\sqrt{.30(.70) / 200}} = 1.543$
This test statistic does not fall into the rejection region, i.e., $1.543 \ngtr 1.645$, therefore we do not reject $H_{0}$. No, there is not evidence that the population proportion is different from .30.
Now, the p-value should be larger than $\alpha = .10$. Indeed, p-value $= 2P(Z > 1.543) = 2(.061415) = .12283 \nless .10$. Do not reject $H_{0}$.
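To see how the conclusion flips between 79 and 70 purchasers, the two-sided p-values can be recomputed with a small helper. This is a sketch under our own assumptions: `one_prop_z_test` is a hypothetical name of ours (it is not a calculator or library function), and it needs Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(successes, n, p_0):
    """Return (Z, two-sided p-value) for H0: p = p_0 (hypothetical helper)."""
    p_s = successes / n
    z = (p_s - p_0) / sqrt(p_0 * (1 - p_0) / n)
    # Two identical tail areas, one on each side
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

alpha = 0.10
z_79, p_79 = one_prop_z_test(79, 200, 0.30)   # parts (a) and (b)
z_70, p_70 = one_prop_z_test(70, 200, 0.30)   # part (c)

for label, z, p in [("79 of 200", z_79, p_79), ("70 of 200", z_70, p_70)]:
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"{label}: Z = {z:.5f}, p-value = {p:.5f}, {decision}")
```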
(d)
Now let's re-do part (a) using the 1-PropZTest function. Under STAT TESTS, choose 1-PropZTest. Enter the information for this problem: .30 for $p_{0}$, 79 for x, 200 for n, and $\neq p_{0}$ for the alternative.
With the cursor on Calculate, press ENTER.
This gives the same results as before. Again note that this method gives the p-value, and does not give the critical value.
Again, go back and choose Draw. Once again, the shading will not show up on the graph.
Let's use the 1-PropZTest function for part (c).
This time, we can actually see the test statistic on the screen!

3.
Consider Exercise 8.59, on page 518 of your textbook.
The director of admissions at a large university advises parents of incoming students about the cost of textbooks during a typical semester. A sample of 100 students enrolled in the university indicates a sample average cost of $315.40 with a sample standard deviation of $43.20.
Note that the standard deviation did come from the sample. Therefore, this will be a t-test, not a Z-test. For this problem, we have s = 43.20, n = 100, and $\overline{x} = 315.40$.
(a)
Using the .10 level of significance, is there evidence that the population average is above $300?
We are asked if the data show that the mean is greater than $300. This will be the alternative hypothesis. Thus the hypotheses here will be
$H_{0}: \mu \leq 300$
$H_{1}: \mu > 300$
Now let's get the test statistic:
$t = \frac{\overline{X} - \mu}{s / \sqrt{n}} =
\frac{315.40 - 300}{43.20 / \sqrt{100}} = 3.5648$
Set up the rejection region by drawing a t-curve (just draw a symmetric bell-shaped curve as usual) and shading the rightmost 10% of the area, in the right tail. We need the t critical value associated with this, which is $t^{(n-1)}_{\alpha} = t^{(99)}_{.10} = 1.2902$. We get this t critical value with the EQUATION SOLVER, just like we did for t confidence intervals. Under the MATH menu, choose Solver. The method explained on the *confidence intervals example page*, where we gave -X for the lower bound and X for the upper bound, makes sense for confidence intervals and also for two-tailed hypothesis tests, but a slightly different method is more convenient for general use. Enter variables for the arguments like you see below; L is for Lower bound, U is for Upper bound, D is for Degrees of freedom, and A is for Area.
We want the value such that there is 10% of the area to the right of that value. Let's have the calculator solve for L, so enter zero on the first line for a "guess". The upper bound is $\infty$, so enter 1E99 for U. The degrees of freedom for this problem are n - 1 = 100 - 1 = 99, and we want the area between the lower and upper bound to be $\alpha = .10$. With the cursor on the L=0 line, press SOLVE (ALPHA ENTER). Remember that this calculation takes about 15-20 seconds.
We now see that $t^{(n-1)}_{\alpha} = t^{(99)}_{.10} = 1.2902$. The test statistic falls into the rejection region, i.e., 3.5648 > 1.2902, therefore we reject $H_{0}$. Yes, there is evidence that the population average is above $300.
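The solver step amounts to finding the lower bound L for which the area under the t-curve from L to infinity equals A. A rough Python analogue is sketched below. It is not what the calculator does internally: we integrate the t density with Simpson's rule and search for L by bisection, and the names `t_pdf`, `t_right_tail`, and `t_critical` are our own.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|] and symmetry about zero."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3        # 0.5 minus the area between 0 and |t|
    return tail if t >= 0 else 1 - tail

def t_critical(area, df, lo=0.0, hi=50.0, tol=1e-8):
    """Solve t_right_tail(L, df) = area for L by bisection (needs area < 0.5)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_right_tail(mid, df) > area:
            lo = mid                  # too much area to the right: move right
        else:
            hi = mid
    return (lo + hi) / 2

# D = 99 degrees of freedom, A = .10 in the right tail
crit = t_critical(0.10, 99)
print(f"t critical value = {crit:.4f}")
```

Unlike the calculator's solver, this search needs no starting guess, only a bracket (here 0 to 50) that contains the answer.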
Now let's get the p-value.
Draw another t-curve, this time shading the area to the right of the test statistic, 3.5648. Again, you could use the same curve you drew in (a) and shade the new area with a different color. The shaded area is the p-value, and we get it with the tcdf function.
p-value $= P(t > 3.5648) = .00028$
This p-value is smaller than $\alpha = .10$, thus we reject $H_{0}$.
(b)
What is your answer in (a) if the standard deviation is $75 and the .05 level of significance is used?
Now, s = 75 and $\alpha = .05$. The test statistic will change to:
$t = \frac{\overline{X} - \mu}{s / \sqrt{n}} =
\frac{315.40 - 300}{75 / \sqrt{100}} = 2.0533$
The rejection region now has only 5% of the area shaded in the right tail. Go back to the EQUATION SOLVER and make the change to the area part of our equation.
Now $t^{(n-1)}_{\alpha} = t^{(99)}_{.05} = 1.6604$. The test statistic falls into the rejection region, i.e., 2.0533 > 1.6604, therefore we reject $H_{0}$.
Now let's get the p-value.
p-value $= P(t > 2.0533) = .02134$
This p-value is smaller than $\alpha = .05$, thus we reject $H_{0}$.
(c)
What is your answer in (a) if the sample average is $305.11?
Now, $\overline{X} = 305.11$. The test statistic will change to:
$t = \frac{\overline{X} - \mu}{s / \sqrt{n}} =
\frac{305.11 - 300}{43.20 / \sqrt{100}} = 1.1829$
The rejection region will be the same as it was in part (a). Our new test statistic does not fall into the rejection region, i.e., $1.1829 \ngtr 1.2902$, therefore we do not reject $H_{0}$.
Now let's get the p-value.
p-value $= P(t > 1.1829) = .11984 \nless .10$
Do not reject $H_{0}$.
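The test statistics and p-values for parts (a), (b), and (c) can be recomputed side by side in Python. This is a sketch only: the standard library has no tcdf, so we approximate the right-tail area with Simpson's rule, and the helper names `t_pdf` and `t_right_tail` are our own.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_right_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|] and symmetry about zero."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3
    return tail if t >= 0 else 1 - tail

mu_0 = 300    # hypothesized mean
n = 100       # sample size

# (part, sample mean, sample standard deviation, alpha)
cases = [
    ("(a)", 315.40, 43.20, 0.10),
    ("(b)", 315.40, 75.00, 0.05),
    ("(c)", 305.11, 43.20, 0.10),
]

results = {}
for part, x_bar, s, alpha in cases:
    t = (x_bar - mu_0) / (s / math.sqrt(n))
    p_value = t_right_tail(t, n - 1)          # right-tailed test
    results[part] = (t, p_value, p_value < alpha)
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"{part} t = {t:.4f}, p-value = {p_value:.5f}, {decision}")
```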
(d)
Now let's re-do part (a) using the T-Test function. Under STAT TESTS, choose T-Test.
We do not have a list of data for this problem; instead we have summary statistics from the textbook, so choose Stats. Enter the information for this problem: 300 for $\mu_{0}$, 315.40 for $\overline{x}$, 43.20 for Sx, 100 for n, and $> \mu_{0}$ for the alternative.
With the cursor on Calculate, press ENTER.
This gives the same results as before. Note again that this gives the p-value, but not the critical value.
T-Test also has the Draw option. Go back and choose Draw instead of Calculate.
Like before, our test statistic is too extreme to be shown on the screen. Imagine it off-screen to the right.

4.
Consider Exercise 8.63, on page 519 of your textbook.
A manufacturer claims that the average capacity of a certain type of battery the company produces is at least 140 ampere-hours. An independent consumer protection agency wishes to test the credibility of the manufacturer's claim and measures the capacity of 20 batteries from a recently produced batch. The results, in ampere-hours, are as follows:
137.4  140.0  138.8  139.1  144.4  139.2  141.8  137.3  133.5  138.2
141.1  139.7  136.7  136.3  135.6  138.0  140.9  140.6  136.7  134.1
We were not given a mean or standard deviation; we'll have to get them ourselves from the data. Of course, the standard deviation we get will be a sample standard deviation, which makes this a t-test, not a Z-test. Enter the data into your calculator, into L1, say. Obtain the summary statistics from STAT CALC 1-Var Stats.
So for this data, we have $\overline{x} = 138.47$, s = 2.6589, and n = 20.
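The 1-Var Stats output can be reproduced with Python's standard `statistics` module. This is just a cross-check; the list name `capacities` is our own, and `stdev` is the sample standard deviation (the Sx value on the calculator, not $\sigma$x).

```python
from statistics import mean, stdev

# Battery capacities in ampere-hours, as listed in the exercise
capacities = [
    137.4, 140.0, 138.8, 139.1, 144.4, 139.2, 141.8, 137.3, 133.5, 138.2,
    141.1, 139.7, 136.7, 136.3, 135.6, 138.0, 140.9, 140.6, 136.7, 134.1,
]

n = len(capacities)
x_bar = mean(capacities)
s = stdev(capacities)     # divides by n - 1

print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.4f}")
```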
(a)
Using the .05 level of significance, is there evidence that the manufacturer's claim is being overstated?
The claim is that the average capacity is at least 140, and we will try to show that it is in fact less than 140. Thus the hypotheses here will be
$H_{0}: \mu \geq 140$
$H_{1}: \mu < 140$
Now let's get the test statistic:
$t = \frac{\overline{X} - \mu}{s / \sqrt{n}} =
\frac{138.47 - 140}{2.6589 / \sqrt{20}} = -2.5734$
Set up the rejection region by drawing a t-curve and shading the leftmost 5% of the area, in the left tail. We need the t critical value associated with this, which is $-t^{(n-1)}_{\alpha} = -t^{(19)}_{.05} = -1.7291$. We get this t critical value with the EQUATION SOLVER, just like we did for the last problem.
The test statistic falls into the rejection region, i.e., -2.5734 < -1.7291, therefore we reject $H_{0}$. Yes, there is evidence that the manufacturer's claim is overstated.
Now let's get the p-value.
Draw another t-curve, this time shading the area to the left of the test statistic, -2.5734. Again, you could use the same curve you drew in (a) and shade the new area with a different color. The shaded area is the p-value, and we get it with the tcdf function.
p-value $= P(t < -2.5734) = .00931$
This p-value is smaller than $\alpha = .05$, thus we reject $H_{0}$.
(b)
What assumption must hold in order to perform the test in (a)?
The population of battery capacities must be (approximately) normally distributed.
(c)
Evaluate this assumption through a graphical approach.
Okay, let's see how likely it is that this data came from a normal population. First, let's look at a boxplot.
This is perfectly symmetric! Looks good so far.
Now let's look at a normal probability plot for this data.
This is a very straight line. Both the boxplot and the normal probability plot seem to be telling us that our data does indeed come from a normal population.
(d)
What is your answer in (a) if the last two values are 146.7 and 144.1 instead of 136.7 and 134.1?
Go to STAT Edit and change the last two values.
Obtain the summary statistics again. The mean and standard deviation will be different.
The test statistic changes to:
$t = \frac{\overline{X} - \mu}{s / \sqrt{n}} =
\frac{139.47 - 140}{3.1749 / \sqrt{20}} = -0.7466$
The rejection region stays the same, so we see that this time, the test statistic does not fall into the rejection region, i.e., $-0.7466 \nless -1.7291$. Therefore, we do not reject $H_{0}$.
For the p-value, shade the area under the t-curve to the left of the test statistic, -0.7466.
p-value $= P(t < -0.7466) \approx .232 \nless .05$
Do not reject $H_{0}$.
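The effect of changing the last two values can also be recomputed directly from the data. The sketch below is an illustration with our own variable names; it takes the left-tail critical value -1.7291 from part (a) rather than recomputing it.

```python
from math import sqrt
from statistics import mean, stdev

# Original battery capacities in ampere-hours
capacities = [
    137.4, 140.0, 138.8, 139.1, 144.4, 139.2, 141.8, 137.3, 133.5, 138.2,
    141.1, 139.7, 136.7, 136.3, 135.6, 138.0, 140.9, 140.6, 136.7, 134.1,
]

# Replace the last two values (136.7 and 134.1) with 146.7 and 144.1
changed = capacities[:-2] + [146.7, 144.1]

mu_0 = 140
n = len(changed)
x_bar = mean(changed)
s = stdev(changed)                    # sample standard deviation
t = (x_bar - mu_0) / (s / sqrt(n))

t_crit = -1.7291                      # left-tail critical value from part (a)
reject = t < t_crit
print(f"mean = {x_bar:.2f}, s = {s:.4f}, t = {t:.4f}, reject H0: {reject}")
```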
(e)
Now let's re-do part (a) using the T-Test function. (We will have to change the data back to what it was originally in part (a) first.) We do have a list of data for this problem, so choose Data. Enter the information for this problem: 140 for $\mu_{0}$, L1 for the list, 1 for the frequency, and $< \mu_{0}$ for the alternative.
With the cursor on Calculate, press ENTER. This gives the same results as before.
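For comparison, a left-tailed one-sample t-test working straight from the list of data can be sketched in Python. This is not the calculator's algorithm: `t_test_less` is a hypothetical helper of ours, and the left-tail area is approximated with Simpson's rule because the standard library has no tcdf.

```python
import math
from statistics import mean, stdev

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T < t), using Simpson's rule on [0, |t|] and symmetry about zero."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3           # area between 0 and |t|
    return 0.5 + central if t >= 0 else 0.5 - central

def t_test_less(data, mu_0):
    """Return (t, p-value) for H1: mu < mu_0 (hypothetical helper)."""
    n = len(data)
    t = (mean(data) - mu_0) / (stdev(data) / math.sqrt(n))
    return t, t_cdf(t, n - 1)

# Original battery capacities in ampere-hours
capacities = [
    137.4, 140.0, 138.8, 139.1, 144.4, 139.2, 141.8, 137.3, 133.5, 138.2,
    141.1, 139.7, 136.7, 136.3, 135.6, 138.0, 140.9, 140.6, 136.7, 134.1,
]

alpha = 0.05
t, p_value = t_test_less(capacities, 140)
decision = "reject H0" if p_value < alpha else "do not reject H0"
print(f"t = {t:.4f}, p-value = {p_value:.5f}, {decision}")
```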
Again, go back and choose Draw. For this one, the shading is visible.



 

2000-10-19