
Motivation

It is easy to think of many data sets that have the same center but are otherwise quite different. For example, consider the following three data sets, placed in columns 2 through 4 of the table:
        Samp.1 Samp.2  Samp.3

    1     88    119     91
    2    166    116     98
    3    143     92    117
    4    110     94     62
    5     86     86     51
    6    108     81     40
    7    133    133     57
    8    105     65     74
    9    114     82     65
   10    126     90     60
   11     87     86     26
   12     99     98     81
   13     72     58    133
   14     98    106    174
   15     73     99    134
   16    137    102    120
   17    109     93    119
   18     82    101    171
   19    122    100    132
   20    174    101     88
   21     65    126    154
   22     99    103    154
   23    109    142     94
   24    105    103    121
   25     79    105    131
__________________________

Median   105    100     98
Mean     108     99    102
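
The last two rows of the table can be checked with a few lines of code. The following is a minimal sketch in Python, which is an assumed choice of language (these notes do not prescribe one); the name `samples` is only illustrative.

```python
from statistics import mean, median

# The three columns of the table, typed in row by row.
samples = {
    "Samp.1": [88, 166, 143, 110, 86, 108, 133, 105, 114, 126, 87, 99, 72,
               98, 73, 137, 109, 82, 122, 174, 65, 99, 109, 105, 79],
    "Samp.2": [119, 116, 92, 94, 86, 81, 133, 65, 82, 90, 86, 98, 58,
               106, 99, 102, 93, 101, 100, 101, 126, 103, 142, 103, 105],
    "Samp.3": [91, 98, 117, 62, 51, 40, 57, 74, 65, 60, 26, 81, 133,
               174, 134, 120, 119, 171, 132, 88, 154, 154, 94, 121, 131],
}

for name, values in samples.items():
    # The mean is rounded to the nearest integer, as in the table.
    print(f"{name}: median = {median(values)}, mean = {mean(values):.0f}")

# Prints:
# Samp.1: median = 105, mean = 108
# Samp.2: median = 100, mean = 99
# Samp.3: median = 98, mean = 102
```
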
Based on the sample medians and means (last two rows of the table), the center estimates are fairly similar across the data sets, considering the noise level. If we estimated only the center, it would be hard to tell these data sets apart. But in this class, PLOT DATA is a must! Comparison boxplots yield:
                                    -------------
 1                           -------I     +     I-----------------
                                    -------------

                                     ------
 2                        *  *    ---I  + I------- *  *
                                     ------

                             -----------------------
 3              -------------I          +          I--------------
                             -----------------------
           ------+---------+---------+---------+---------+---------+C20
                30        60        90       120       150       180
By the lengths of the boxes (i.e., the interquartile ranges), we see that the noise levels are quite different in the three data sets. Sample 3 seems to be about twice as noisy as Sample 1, and Sample 1 about twice as noisy as Sample 2. So along with measures of center we need measures of noise. For the third data set the boxplot misses something very important: from the stem-and-leaf plots below, these data appear to be bimodal, while the other two data sets appear to be unimodal.
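
The box lengths can also be checked numerically. The sketch below, in Python (an assumed choice of language), takes the quartiles at positions (n+1)/4 and 3(n+1)/4 in the sorted data, interpolating between neighbours. This convention and the helper names `quartile` and `iqr` are assumptions; other quartile rules may give slightly different values.

```python
samples = {
    "Samp.1": [88, 166, 143, 110, 86, 108, 133, 105, 114, 126, 87, 99, 72,
               98, 73, 137, 109, 82, 122, 174, 65, 99, 109, 105, 79],
    "Samp.2": [119, 116, 92, 94, 86, 81, 133, 65, 82, 90, 86, 98, 58,
               106, 99, 102, 93, 101, 100, 101, 126, 103, 142, 103, 105],
    "Samp.3": [91, 98, 117, 62, 51, 40, 57, 74, 65, 60, 26, 81, 133,
               174, 134, 120, 119, 171, 132, 88, 154, 154, 94, 121, 131],
}


def quartile(values, p):
    """Value at 1-based position p*(n+1) in the sorted data (assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    pos = p * (n + 1)
    lower = min(max(int(pos), 1), n)   # 1-based index of the lower neighbour
    upper = min(lower + 1, n)          # 1-based index of the upper neighbour
    frac = pos - int(pos)              # how far to move toward the upper neighbour
    return ordered[lower - 1] + frac * (ordered[upper - 1] - ordered[lower - 1])


def iqr(values):
    """Interquartile range: third quartile minus first quartile."""
    return quartile(values, 0.75) - quartile(values, 0.25)


for name, values in samples.items():
    print(f"{name}: IQR = {iqr(values)}")
```

Under this convention the interquartile ranges come out as 37.5, 17.5 and 69 for Samples 1, 2 and 3, which agrees with the relative box lengths in the plot.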

 Stem-and-leaf of Sample 1       N  = 25
 Leaf Unit = 1.0


     1    6 5
     4    7 239
     8    8 2678
    11    9 899
    (5)  10 55899
     9   11 04
     7   12 26
     5   13 37
     3   14 3
     2   15
     2   16 6
     1   17 4


 Stem-and-leaf of Sample 2       N  = 25
 Leaf Unit = 1.0


     1    5 8
     2    6 5
     2    7
     6    8 1266
    12    9 023489
    (8)  10 01123356
     5   11 69
     3   12 6
     2   13 3
     1   14 2


 Stem-and-leaf of Sample 3       N  = 25
 Leaf Unit = 10


     1    0 3
     3    0 45
     8    0 66677
    12    0 8999
    (1)   1 0
    12    1 22223333
     4    1 55
     2    1 77
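
A basic stem-and-leaf display is easy to build in code. The sketch below (Python, an assumed choice of language; the function name `stem_and_leaf` is hypothetical) uses one line per stem and a leaf unit of 1, and it omits the depth column shown at the left of the displays above. It is only meant to reproduce the stems and leaves for Sample 1; it does not split a stem across several lines, as the Sample 3 display does.

```python
def stem_and_leaf(values):
    """Map each stem (tens and above) to its string of leaves (units digits)."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem] = rows.get(stem, "") + str(leaf)
    # Include empty stems so that gaps in the data stay visible.
    return {s: rows.get(s, "") for s in range(min(rows), max(rows) + 1)}


sample1 = [88, 166, 143, 110, 86, 108, 133, 105, 114, 126, 87, 99, 72,
           98, 73, 137, 109, 82, 122, 174, 65, 99, 109, 105, 79]

for stem, leaves in stem_and_leaf(sample1).items():
    print(f"{stem:4d} {leaves}")
```

Running the same function on Sample 2 should give the corresponding stems and leaves for that display as well, since it also uses a leaf unit of 1.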

Again: you must PLOT the data, and it is best to use several different types of plots. What do the comparison boxplots tell you (5 extra brownie points)?



2001-01-01