As we said, discrete data have natural categories. Hence, to describe a discrete data set, we simply classify the data into their categories and count how many observations fall into each. For example, suppose we ask our 20 students which is their stronger hand, i.e., whether they are left-handed (L) or right-handed (R). The responses are:

Hand L R R R R L R R R R R R R R L R R R R R

Hence these are discrete data with two categories, R and L. Classifying the data, we obtain

| R | L |
| --- | --- |
| 17 | 3 |

This is the **distribution** of the data. It is indeed **the** distribution;
there is no other.
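The classification above can be reproduced with a short tally. The following is a minimal sketch in Python, which is our choice of language and not a tool used in these notes; the variable names `responses` and `distribution` are illustrative only.

```python
from collections import Counter

# The 20 responses from the text, in the order listed.
responses = "L R R R R L R R R R R R R R L R R R R R".split()

# Classify the data: count how many responses fall in each category.
distribution = Counter(responses)

# Print the categories from most to least frequent.
for hand, count in distribution.most_common():
    print(hand, count)
```

Because every response falls into exactly one category, the counts must add up to the sample size of 20.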

A picture of the sample distribution is given in Figure 1.1:

Note how informative this picture is. It tells you immediately that there
are many more right-handed people in the sample than left-handed: more
than 5 times as many (17 versus 3). This picture is much more informative
than the 20 L's and R's listed above.
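To see why a picture of the counts is so much easier to read than the raw list, here is a rough sketch that prints a plain-text bar for each category. It is not the figure from these notes, only an illustration; the helper `text_bars` is a hypothetical name introduced here, and the counts are the ones obtained in the text.

```python
# Counts taken from the distribution in the text.
counts = {"R": 17, "L": 3}

def text_bars(counts, symbol="#"):
    """Return one line per category: label, a bar of symbols, and the count."""
    return [f"{label} {symbol * n} {n}" for label, n in counts.items()]

for line in text_bars(counts):
    print(line)

# How many right-handers there are for each left-hander in the sample.
ratio = counts["R"] / counts["L"]
print(f"right-handers per left-hander: {ratio:.2f}")
```

The difference in bar lengths makes the imbalance visible at a glance, and the ratio 17/3 confirms the "more than 5 times" remark.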

One **statistic** of interest here is the **sample proportion**
of left-handers, which is 3/20 or 0.15 (15%). Later in the
course, we will discuss how to use the sample proportion to **estimate**
the true proportion of left-handed people in the university (the population).
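The sample proportion is simply the count in the category of interest divided by the sample size. As a hedged illustration, the sketch below computes it for the handedness data; the function name `sample_proportion` is our own and is not part of any module mentioned in these notes.

```python
from fractions import Fraction

# The 20 responses from the text.
responses = "L R R R R L R R R R R R R R L R R R R R".split()

def sample_proportion(data, category):
    """Fraction of the observations in `data` that equal `category`."""
    return Fraction(data.count(category), len(data))

p_left = sample_proportion(responses, "L")
print(p_left, float(p_left))  # exact fraction, then its decimal value
```

Using `Fraction` keeps the exact value 3/20 alongside the decimal form; the same function applies to any category of any discrete data set, such as the 0/1 data in the exercises.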

1. Obtain the distribution of hair color for the above 20 students. Then draw a histogram of it by hand. Obtain the sample proportion of blonds.
2. Obtain the distribution of hair color for the above 20 students using the summary module.
3. Some time ago, Carrie had a deck of 59 baseball cards. The data recorded from this deck are given in Appendix A. The fourth column of these data gives the stronger hand of the baseball player: 0 for right-handed and 1 for left-handed. Obtain the distribution of the stronger hand of the baseball players and draw a histogram of it by hand.
4. Repeat the previous exercise using the summary module to draw the histogram.
5. Note that about 11% of the males in America are left-handed. Obtain the sample proportion of left-handed baseball players. Does it seem high compared to 11%? If so, can you think of a reason why it would be high?
6. Obtain the distribution and the proportion of ones in the following sample.

   Data 1 1 1 1 0 0 0 1 1 0 1 0 0 0 1 0 0 0 1 0 0 0 1 1 1 0 0 1 1 1 0 0 0 1 0 1 0 0 1 0