A bowler is bragging that his average is ``at least
180''. We observe him play three games; his scores are
125, 155, 140 ($\bar{X}=140$, *S*=15). Should we accept or reject his claim?
*We should reject it*. Why?
*Because a sample average as low as 140 is unlikely from a 180 bowler.*
How unlikely? *A 180 bowler will bowl a 3-game
average of 140 or lower only 2 percent of the time.*
Is 2 percent of the time unlikely?
*In statistics, yes. 5 percent or less is called
statistically significant.*

The decision-making process above is called a test of significance. Here is the way a statistical report would formally present the test, in numbered stages.

- 1. Hypotheses: $H_0: \mu \geq 180$ versus $H_1: \mu < 180$
- 2. Test Statistic: $t = \dfrac{\bar{X}-180}{S/\sqrt{n}} = \dfrac{140-180}{15/\sqrt{3}} = -4.62$
- 3. P-value: Presuming $H_0$ is true, the likelihood of chance variation yielding a *t*-statistic as low as -4.62 is .02. (Calculation details later.)
- 4. Conclusion: Since P-value $\leq .05$, the observed sample value $\bar{X}=140$ is declared significantly unlikely under $H_0$. Hence, we reject $H_0$ and conclude $H_1$. The sample provides evidence to reject the bowler's claim.
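
The test statistic in stage 2 can be checked directly from the three scores. The following is a minimal Python sketch using only the standard library; the variable names (`scores`, `mu0`, `std_error`) are illustrative and do not come from the text.

```python
import math
import statistics

scores = [125, 155, 140]  # the three observed games
mu0 = 180                 # boundary value of the bowler's claim

n = len(scores)
x_bar = statistics.mean(scores)   # sample mean
s = statistics.stdev(scores)      # sample standard deviation (n - 1 divisor)
std_error = s / math.sqrt(n)      # estimated standard error of the sample mean
t = (x_bar - mu0) / std_error     # t-test statistic

print(f"x_bar = {x_bar}, S = {s}, t = {t:.2f}")
```

The computed *t* rounds to -4.62, matching the value quoted in the report.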

Here is a more detailed description of each component of the test of significance above.

- 1.
- The null and alternative hypotheses

$H_0: \mu \geq 180$ versus $H_1: \mu < 180$.

$H_0$ and $H_1$ are called the **null hypothesis** and **alternative hypothesis**, respectively. The two hypotheses describe the two possibilities: the claim is true ($\mu \geq 180$), or the claim is false ($\mu < 180$). Note that

(i) the two hypotheses are statements about the population;
(ii) the two hypotheses are complementary: if one holds, the other does not;
(iii) the hypothesis with the equal sign is the null hypothesis.

A test of significance rejects (population statement) $H_0$ and concludes $H_1$ if the *sample* values are ``significantly far from $H_0$ and inside $H_1$''. Hence, we reject $H_0$ and conclude $H_1$ if $\bar{X}$ is some significant distance below 180. How far below 180 is ``significant''? The test statistic helps us determine where to draw the line in the sand.

- 2.
- The Test Statistic

For tests of hypotheses on $\mu$, the *t*-test statistic is a ratio of the form

$$t = \frac{\text{sample mean} - \text{hypothesized mean}}{\text{standard error of the sample mean}}$$

For the null hypothesis $H_0: \mu \geq 180$, the *t*-test statistic is

$$t = \frac{\bar{X}-180}{S/\sqrt{n}}$$

$H_0$ will be rejected if and only if $\bar{X}$ is some significant distance below 180, which happens if and only if *t* is some significant distance below 0. Based on the observed sample scores, the observed *t* value is

$$t = \frac{140-180}{15/\sqrt{3}} = -4.62$$

``Is *t*=-4.62 significantly below 0?'' To answer this, we will need the help of the *t*-curve with *n*-1 degrees of freedom.

- 3.
- The P-value

Using the *t* curve with *n*-1=2 degrees of freedom, the likelihood of chance variation resulting in a *t*-value as low as -4.62 is .02.

Since this likelihood is less than .05 (the standard for statistical significance), we declare that ``*t*=-4.62 is significantly below 0'', or that ``$\bar{X}=140$ is significantly below 180'', and reject $H_0$.

In general, the P-value is the total area under the curve more extreme than *t* in support of $H_1$. If *t* is ``deep in $H_1$ territory'', then the P-value is small. If P-value $\leq$ .05, we reject $H_0$ with statistical significance. If P-value $\leq$ .01, we reject $H_0$ with *high* statistical significance. If P-value is larger than .05, we accept $H_0$.

- 4.
- Conclusion

If $H_0$ is rejected, the conclusion is usually stated as `there is enough evidence to ...' or `there are statistically significant differences...'. If $H_0$ is accepted, the conclusion is usually stated as `there is not enough evidence to ...', or `there are no statistically significant differences...'. Since P-value=.02 in our example, we conclude that `the sample provides enough evidence to reject the bowler's claim of a 180 average'. Or `his performance ($\bar{X}=140$) was much lower than his claimed average ($\mu \geq 180$), and the difference is statistically significant.'
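
The P-value and the decision rule in components 3 and 4 can be sketched in a few lines of Python. For the special case of 2 degrees of freedom, the area to the left of *t* under the *t*-curve has the closed form $\frac{1}{2} + \frac{t}{2\sqrt{2+t^2}}$; this is a standard identity that holds only for 2 degrees of freedom, and it is not necessarily the calculation the text has in mind. The function names `t2_lower_tail` and `verdict` are illustrative.

```python
import math

def t2_lower_tail(t):
    """Area to the left of t under the t-curve with 2 degrees of freedom."""
    return 0.5 + t / (2.0 * math.sqrt(2.0 + t * t))

def verdict(p_value):
    """Apply the .05 and .01 cutoffs described in the text."""
    if p_value <= 0.01:
        return "reject H0 with high statistical significance"
    if p_value <= 0.05:
        return "reject H0 with statistical significance"
    return "do not reject H0"

p = t2_lower_tail(-4.62)
print(round(p, 3), "->", verdict(p))
```

For *t*=-4.62 the area is roughly .022, consistent with the .02 quoted above, so the rule rejects $H_0$ at the .05 level but not at the .01 level.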

Summary of the **lower-tailed** *t*-test for $\mu$:

- $H_0: \mu \geq \mu_0$ versus $H_1: \mu < \mu_0$, where $\mu_0$ is the hypothesized value (180 in the bowling example)
- Test statistic: $t = \dfrac{\bar{X}-\mu_0}{S/\sqrt{n}}$
- P-value: Total area *less than* *t* (the direction of $H_1$) under the *t*-curve with *n*-1 degrees of freedom. If *t* is significantly *below* 0 (the direction of $H_1$), the P-value will be small.
- Conclusion: If P-value $\leq$ .05, we reject $H_0$ with statistical significance. If P-value $\leq$ .01, we reject $H_0$ with high statistical significance. If P-value >.05, we do not reject $H_0$.
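
The summary can be turned into a small general-purpose routine. The sketch below is one possible implementation, not a reference one: it assumes the sample is a plain list of numbers, and it approximates the area under the *t*-curve numerically with Simpson's rule so that it works for any number of degrees of freedom using only the standard library. The names `t_density`, `lower_tail_p_value` and `lower_tailed_t_test` are hypothetical.

```python
import math

def t_density(x, df):
    """Height of the t-curve with df degrees of freedom at x."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def lower_tail_p_value(t, df, steps=2000):
    """Area to the left of t: Simpson's rule on [0, |t|] plus symmetry.

    steps must be even for Simpson's rule.
    """
    a = abs(t)
    h = a / steps
    total = t_density(0, df) + t_density(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central = total * h / 3  # area between 0 and |t|
    return 0.5 - central if t < 0 else 0.5 + central

def lower_tailed_t_test(sample, mu0):
    """Return (t, P-value) for H0: mu >= mu0 versus H1: mu < mu0."""
    n = len(sample)
    x_bar = sum(sample) / n
    s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, lower_tail_p_value(t, n - 1)

t, p = lower_tailed_t_test([125, 155, 140], 180)
print(f"t = {t:.2f}, P-value = {p:.3f}")
```

Applied to the bowling scores, the routine gives *t* close to -4.62 and a P-value of about .02. Because `math.gamma` overflows for very large arguments, this sketch is only suitable for small to moderate sample sizes; a statistics library would be the usual choice in practice.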

- Upper-tailed *t*-test
- Two-tailed *t*-test
- The *t*-test is not universally appropriate
- Generalizing Tests of Significance
- Examples
- Type I and Type II Errors