
The Average and SD

Let $X_1, X_2, \ldots, X_n$ denote the data values. The average, or arithmetic mean (denoted $\overline{X}$ in formulas), is computed as

\begin{displaymath}\overline{X}= \frac{X_1+X_2+\cdots +X_n}{n}.
\end{displaymath} (3.1)

The average, which necessarily falls somewhere between the smallest and largest data values, is a commonly used statistic for indicating the center or location of the data. Because the data points are spread out, most of them miss the average by some amount. In the Pharmacia example, none of the points equals the average of \$50.54, so every point in the data misses (or deviates from) the average by some amount. What is the average size of the deviation? To a large extent, this is what we call the standard deviation, or SD (denoted $S$ in formulas). Let $\vert X_i-\overline{X}\vert$ denote the sizes of the deviations from the average (ignoring negative signs). Note that, with $\doteq$ read as ``is approximately equal to'',

\begin{displaymath}\mbox{ave}\{\vert X_i-\overline{X}\vert\} \doteq \sqrt{\mbox{ave}\{\vert X_i-\overline{X}\vert^2\}}
\end{displaymath}

or in formulas,

\begin{displaymath}\frac{\sum \vert X_i-\overline{X}\vert}{n}
\doteq \sqrt{ \frac{\sum \vert X_i-\overline{X}\vert^2}{n}}
\end{displaymath}

The SD is the right-hand side of this formula with the denominator $n$ replaced by $n-1$:

 \begin{displaymath}
S= \sqrt{ \frac{\sum \vert X_i-\overline{X}\vert^2}{n-1}}
\end{displaymath} (3.2)
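
As a minimal sketch of formulas (3.1) and (3.2), the following Python functions compute the average and the SD directly from their definitions. The function names average and sd and the four-value toy data set are assumptions made for illustration; they do not come from the text.

```python
import math

def average(xs):
    """Arithmetic mean, formula (3.1): sum of the values divided by n."""
    return sum(xs) / len(xs)

def sd(xs):
    """Standard deviation, formula (3.2): uses n-1 in the denominator."""
    xbar = average(xs)
    squared_devs = [(x - xbar) ** 2 for x in xs]
    return math.sqrt(sum(squared_devs) / (len(xs) - 1))

# Hypothetical toy data, chosen only so the arithmetic is easy to follow.
toy = [2.0, 4.0, 6.0, 8.0]
print(average(toy))  # 5.0
print(sd(toy))       # square root of 20/3, roughly 2.58
```

Because of the $n-1$ denominator, this sketch needs at least two data values; with a single value the division is by zero.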

Why do we use the square root of the average of squares instead of just the average of absolute values? Why do we replace $n$ by $n-1$ in the denominator? The long answers are mathematically complicated; a short answer is ``because both adjustments give the statistic better mathematical properties.''

Here are the monthly Pharmacia stock prices, sorted from smallest to largest.

      39.60   40.52   40.56   42.65   44.40   44.62   45.95   48.56   
      49.93   50.37   51.68   51.70   51.93   52.26   54.75   55.00   
      56.02   58.56   60.18   61.00   61.00

Now, compute the deviations from the average, $X_i-50.54$.

     -10.94  -10.02   -9.98   -7.89   -6.14   -5.92   -4.59   -1.98   
      -0.61   -0.17    1.14    1.16    1.39    1.72    4.21    4.46    
       5.48    8.02    9.64   10.46   10.46

The average size of the deviations (ignoring negative signs) is

\begin{displaymath}(10.94 + 10.02 + 9.98 + \cdots + 10.46)/21= 5.54.
\end{displaymath}

The square root of the average of squares is

\begin{displaymath}\sqrt{(10.94^2 + 10.02^2 + 9.98^2 + \cdots + 10.46^2)/21}=6.66.
\end{displaymath}

The standard deviation is

\begin{displaymath}\sqrt{(10.94^2 + 10.02^2 + 9.98^2 + \cdots + 10.46^2)/20}=6.82.
\end{displaymath}
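
The three figures above can be checked with a few lines of Python. This is only a sketch: the variable names are chosen here for illustration, and the average is recomputed from the listed prices rather than taken as the rounded value 50.54, which does not change the results to two decimal places.

```python
import math

# Monthly Pharmacia stock prices as listed in the text.
prices = [
    39.60, 40.52, 40.56, 42.65, 44.40, 44.62, 45.95, 48.56,
    49.93, 50.37, 51.68, 51.70, 51.93, 52.26, 54.75, 55.00,
    56.02, 58.56, 60.18, 61.00, 61.00,
]

n = len(prices)                      # 21 data values
xbar = sum(prices) / n               # average, formula (3.1)
devs = [x - xbar for x in prices]    # deviations from the average

# Average size of the deviations, ignoring negative signs.
mean_abs_dev = sum(abs(d) for d in devs) / n

# Square root of the average of squares (denominator n).
rms_dev = math.sqrt(sum(d * d for d in devs) / n)

# Standard deviation, formula (3.2) (denominator n-1).
sd_value = math.sqrt(sum(d * d for d in devs) / (n - 1))

print(round(xbar, 2))          # 50.54
print(round(mean_abs_dev, 2))  # 5.54
print(round(rms_dev, 2))       # 6.66
print(round(sd_value, 2))      # 6.82
```

The last two results differ only in the denominator, so the SD is always slightly larger than the square root of the average of squares, and the gap shrinks as $n$ grows.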



 

2003-09-08