The five values (2.37, 2.72, 3.11, 3.52, 4.00) that divide the data into quarters and mark the box edges and whisker ends of the boxplot are collectively called the five-number summary of the data. They are often denoted as MIN, Q1, MED, Q3, and MAX, respectively.
1. MIN is called the minimum, and is the smallest of the ordered observations.
2. Q1 is the upper boundary of the first quarter, and is called the first quartile.
3. MED is the upper boundary of the second quarter, and is called the second quartile. However, it also divides the data into lower and upper halves, and is more often called the median.
4. Q3 is the upper boundary of the third quarter, and is called the third quartile.
5. MAX is the largest of the ordered observations and is called the maximum.
Boxplots are quite useful for comparing two distributions side by side. Below, we present a boxplot of second-year GPA alongside the boxplot of high school GPA. The boxplots are presented vertically this time, but the interpretation remains the same. Note that there is a slight difference in location as measured by the medians, but there is a radical difference in spread between the two distributions. The most noteworthy feature of second-year GPA is the long left (here, lower) tail, which is evidence that some students are not doing very well in college. (Note: Some computing packages use a special symbol to denote outlying values, or outliers. The boxplot for second-year GPA has an extremely low outlier, denoted by a circle. The left (or bottom) whisker ends at the second smallest observation.)
Different statistical computing packages often have different ways of computing the quartiles. In this class, we compute the quartiles as follows. First, arrange the observations from smallest (1st ordered observation) to largest (nth ordered observation). Then
Q1 is the .25(n+1)st ordered observation.
MED is the .50(n+1)st ordered observation.
Q3 is the .75(n+1)st ordered observation.
If .25(n+1) is not an integer, take the average of the two adjacent ordered observations. Similarly for MED and Q3. Following are the 56 ordered observations of HS GPA used in the boxplot above.
2.37 2.43 2.46 2.55 2.57 2.58 2.59 2.60 2.60 2.60 2.61 2.63 2.67 2.71 2.73 2.75 2.78 2.78 2.78 2.79 2.81 2.81 2.82 2.90 2.91 2.93 2.94 3.08 3.14 3.16 3.19 3.20 3.21 3.29 3.32 3.33 3.35 3.36 3.36 3.36 3.44 3.50 3.54 3.54 3.57 3.58 3.60 3.62 3.72 3.73 3.76 3.76 3.77 3.83 3.86 4.00
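Since .25(56+1)=14.25, Q1 is computed as the average of the 14th and 15th ordered observations: (2.71+2.73)/2=2.72. Similarly, .50(56+1)=28.5, so MED is the average of the 28th and 29th ordered observations, (3.08+3.14)/2=3.11, and .75(56+1)=42.75, so Q3 is the average of the 42nd and 43rd ordered observations, (3.50+3.54)/2=3.52.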
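The rule used in this class can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the names ordered_value, five_number_summary, and hs_gpa are hypothetical, and the sketch assumes at least three observations so that every position falls inside the data. For contrast, it also calls the standard library's statistics.quantiles, which uses the same (n+1) positions but interpolates linearly instead of averaging the two neighbours.

```python
import math
import statistics

# The 56 ordered HS GPA observations listed above.
hs_gpa = [
    2.37, 2.43, 2.46, 2.55, 2.57, 2.58, 2.59, 2.60, 2.60, 2.60,
    2.61, 2.63, 2.67, 2.71, 2.73, 2.75, 2.78, 2.78, 2.78, 2.79,
    2.81, 2.81, 2.82, 2.90, 2.91, 2.93, 2.94, 3.08, 3.14, 3.16,
    3.19, 3.20, 3.21, 3.29, 3.32, 3.33, 3.35, 3.36, 3.36, 3.36,
    3.44, 3.50, 3.54, 3.54, 3.57, 3.58, 3.60, 3.62, 3.72, 3.73,
    3.76, 3.76, 3.77, 3.83, 3.86, 4.00,
]


def ordered_value(sorted_obs, position):
    """Return the observation at a 1-based position.

    If the position is not an integer, average the two adjacent
    ordered observations (the rule stated in the text).
    """
    lower = math.floor(position)
    if position == lower:
        return sorted_obs[lower - 1]
    return (sorted_obs[lower - 1] + sorted_obs[lower]) / 2


def five_number_summary(observations):
    """Return (MIN, Q1, MED, Q3, MAX) using the .25/.50/.75 * (n+1) positions.

    Assumes at least three observations.
    """
    obs = sorted(observations)
    n = len(obs)
    q1 = ordered_value(obs, 0.25 * (n + 1))
    med = ordered_value(obs, 0.50 * (n + 1))
    q3 = ordered_value(obs, 0.75 * (n + 1))
    return obs[0], q1, med, q3, obs[-1]


class_rule = five_number_summary(hs_gpa)
print("class rule:", [f"{value:.2f}" for value in class_rule])

# A different convention: linear interpolation at the same (n+1) positions.
interpolated = statistics.quantiles(hs_gpa, n=4, method="exclusive")
print("statistics.quantiles:", [f"{value:.3f}" for value in interpolated])
```

On these data the two conventions agree on the median but differ slightly in Q1 and Q3 (about 2.715 and 3.53 under linear interpolation, versus 2.72 and 3.52 under the class rule), which is the kind of discrepancy between packages mentioned above.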