Measures of Dispersion. As described in the previous section, the mean is the measure of
central tendency most commonly used in contract pricing. Though the mean for a data set is a
value around which the other values tend to cluster, it conveys no indication of the closeness of
the clustering (that is, the dispersion). All observations could be close to the mean or far away.
If you want an indication of how closely these other values are clustered around the mean, you
must look beyond measures of central tendency to measures of dispersion. This section will
examine:
• Several measures of absolute dispersion commonly used to describe the variation within a
data set: the range, mean absolute deviation, variance, and standard deviation.
• One measure of relative dispersion: the coefficient of variation.
Assume that you have the following scrap rate data for two contractor departments:
Contractor Scrap Rate Data
Month       Dept. A, Fabrication    Dept. B, Assembly
February    .065                    .050
March       .035                    .048
April       .042                    .052
May         .058                    .053
June        .032                    .048
July        .068                    .049
Total       .300                    .300
Mean        .050                    .050
The mean scrap rate for both departments is the same — 5 percent. However, the monthly scrap
rates in Department B show less variation (dispersion) around the mean. As a result, you would
probably feel more comfortable forecasting a scrap rate of 5 percent for Department B than you
would for Department A.
Differences in dispersion will not always be so obvious. The remainder of this section will
demonstrate how you can quantify dispersion using the five measures identified above.
Calculating the Range. Probably the quickest and easiest measure of dispersion to calculate is
the range. The range of a set of data is the difference between the highest and lowest observed
values. The larger the range, the greater the amount of variation in a data set.
R = H – L
Where:
R = Range
H = Highest observed value in the data set
L = Lowest observed value in the data set
Calculating the Range for the Scrap-Rate Example. By comparing the range for Department A
scrap-rate data with the range for Department B, you can easily determine that the historical data
from Department A shows greater dispersion.
Department A: R = H – L = .068 – .032 = .036
Department B: R = H – L = .053 – .048 = .005
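The range comparison can also be reproduced programmatically. The following is a minimal sketch in Python using the scrap-rate data from the table above; the names dept_a, dept_b, and data_range are illustrative assumptions, not part of the original text.

```python
# Monthly scrap rates (February through July) from the scrap-rate example.
dept_a = [0.065, 0.035, 0.042, 0.058, 0.032, 0.068]  # Fabrication
dept_b = [0.050, 0.048, 0.052, 0.053, 0.048, 0.049]  # Assembly


def data_range(values):
    """Return R = H - L: highest observed value minus lowest observed value."""
    return max(values) - min(values)


range_a = data_range(dept_a)
range_b = data_range(dept_b)

# Format to three decimals, the precision used in the source data.
print(f"Department A range: {range_a:.3f}")
print(f"Department B range: {range_b:.3f}")
```

Because the range depends only on the two extreme observations, it is quick to compute but ignores how the remaining values are distributed.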
Mean Absolute Deviation. The mean absolute deviation (MAD) is the average absolute
difference between the observed values and the arithmetic mean (average) for all values in the
data set.
If you subtract the mean from each observation, some differences will be positive and some
negative, and the sum of all the deviations (differences) will always be zero. That sum tells you
nothing about how far the average observation is from the mean. To make that computation, you
can use the absolute difference between each observation and the mean. An absolute difference is
the difference without regard to sign, so all absolute values are written as positive numbers.
If the average is eight and the observed value is six, the calculated difference is negative two
(6 – 8 = –2), but the absolute difference is two (without the negative sign).
Note: Absolute values are identified using a vertical line before and after the value (e.g., |X|
identifies the absolute value of X).
To compute the MAD, use the following 5-step process:
Step 1. Calculate the arithmetic mean of the data set.
Step 2. Calculate the deviation (difference) between each observation and the mean of the data
set.
Step 3. Convert each deviation to its absolute value (i.e., its value without considering the sign
of the deviation).
Step 4. Sum the absolute deviations.
Step 5. Divide the total absolute deviation by the number of observations in the data set.
Calculating the Mean Absolute Deviation for the Scrap-Rate Example. We can use the 5-step
process described above to calculate the scrap-rate MAD values for Departments A and B of the
scrap-rate example.
Calculate the MAD for Department A:
Step 1. Calculate the arithmetic mean of the data set. We have already calculated the mean
rate for Department A of the scrap-rate example — .05.
Step 2. Calculate the deviation (difference) between each observation and the mean of the
data set.
Department A (Fabrication)
X       Mean    Deviation
.065    .050     .015
.035    .050    -.015
.042    .050    -.008
.058    .050     .008
.032    .050    -.018
.068    .050     .018
Step 3. Convert each deviation to its absolute value.
Department A (Fabrication)
X       Mean    Deviation    Absolute Deviation
.065    .050     .015        .015
.035    .050    -.015        .015
.042    .050    -.008        .008
.058    .050     .008        .008
.032    .050    -.018        .018
.068    .050     .018        .018
Step 4. Sum the absolute deviations.
Department A (Fabrication)
X       Mean    Deviation    Absolute Deviation
.065    .050     .015        .015
.035    .050    -.015        .015
.042    .050    -.008        .008
.058    .050     .008        .008
.032    .050    -.018        .018
.068    .050     .018        .018
Total                        .082
Step 5. Divide the total absolute deviation by the number of observations in the data set.
MAD = .082 / 6 = .013667, or approximately .014
Calculate the MAD for Department B:
Step 1. Calculate the arithmetic mean of the data set. We have also calculated the mean rate
for Department B of the scrap-rate example — .05.
Steps 2 – 4. Calculate the deviation between each observation and the mean of the data set;
convert the deviation to its absolute value; and sum the absolute deviations. The following
table demonstrates the three steps required to calculate the total absolute deviation for
Department B:
Department B (Assembly)
X       Mean    Deviation    Absolute Deviation
.050    .050     .000        .000
.048    .050    -.002        .002
.052    .050     .002        .002
.053    .050     .003        .003
.048    .050    -.002        .002
.049    .050    -.001        .001
Total                        .010
Step 5. Divide the total absolute deviation by the number of observations in the data set.
MAD = .010 / 6 = .001667, or approximately .002
Compare MAD values for Department A and Department B:
The MAD for Department A is .014; the MAD for Department B is .002. Note that the MAD for
Department B is much smaller than the MAD for Department A. This comparison once again
confirms that there is less dispersion in the observations from Department B.
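The 5-step MAD process can be sketched in Python as follows. This is an illustrative sketch only; the function name mean_absolute_deviation and the variable names are assumptions introduced here, and the data are the scrap rates from the example.

```python
dept_a = [0.065, 0.035, 0.042, 0.058, 0.032, 0.068]  # Fabrication
dept_b = [0.050, 0.048, 0.052, 0.053, 0.048, 0.049]  # Assembly


def mean_absolute_deviation(values):
    """Follow the 5-step process: mean, deviations, absolute values, sum, divide."""
    n = len(values)
    mean = sum(values) / n                          # Step 1
    deviations = [x - mean for x in values]         # Step 2
    abs_deviations = [abs(d) for d in deviations]   # Step 3
    total_abs_deviation = sum(abs_deviations)       # Step 4
    return total_abs_deviation / n                  # Step 5


mad_a = mean_absolute_deviation(dept_a)
mad_b = mean_absolute_deviation(dept_b)

# Three decimals matches the rounding used in the comparison above.
print(f"MAD, Department A: {mad_a:.3f}")
print(f"MAD, Department B: {mad_b:.3f}")
```

Rounded to three decimal places, the results agree with the .014 and .002 values compared above.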
Calculating the Variance. Variance is one of the two most popular measures of dispersion (the
other is the standard deviation, which is described below). The variance of a sample is the
average of the squared deviations between each observation and the mean.
However, statisticians have determined that when you have a relatively small sample, you can get
a better estimate of the true population variance if you calculate variance by dividing the sum of
the squared deviations by n – 1, instead of n.
Variance = Σ(X – Mean)² / (n – 1)
Where:
X = Each observed value in the data set
Mean = Arithmetic mean of the data set
n = Number of observations in the data set
The term, n – 1, is known as the number of degrees of freedom that can be used to estimate
population variance.
This adjustment is necessary because samples are usually more alike than the populations from
which they are taken. Without this adjustment, the sample variance is likely to underestimate the
true variation in the population. Division by n – 1 in a sense artificially inflates the sample
variance, but in so doing, it makes the sample variance a better estimator of the population
variance. As the sample size increases, the relative effect of this adjustment decreases (e.g.,
dividing by four rather than five will have a greater effect on the quotient than dividing by 29
instead of 30).
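The shrinking effect of the n – 1 adjustment can be illustrated with a short Python sketch. For any fixed sum of squared deviations, dividing by n – 1 instead of n multiplies the result by n / (n – 1); the function name inflation_factor is an assumption introduced here for illustration.

```python
def inflation_factor(n):
    """Ratio of the quotient using n - 1 to the quotient using n.

    For a fixed sum of squared deviations SS, (SS / (n - 1)) / (SS / n) = n / (n - 1).
    """
    return n / (n - 1)


# Sample sizes from the parenthetical example: 5 (divide by 4) and 30 (divide by 29).
for n in (5, 30):
    increase = inflation_factor(n) - 1
    print(f"n = {n:2d}: dividing by {n - 1} instead of {n} raises the result by {increase:.1%}")
```

With five observations the adjustment raises the result by 25 percent; with 30 observations the increase is only about 3.4 percent.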
To compute the variance, use this 5-step process:
Step 1. Calculate the arithmetic mean of the data set.
Step 2. Calculate the deviation (difference) between each observation and the mean of the data
set.
Step 3. Square each deviation.
Step 4. Sum the squared deviations.
Step 5. Divide the sum of the squared deviations by n-1.
Calculate the variance for Department A:
Step 1. Calculate the arithmetic mean of the data set. We have already calculated the mean
rate for Department A of the scrap-rate example — .05.
Step 2. Calculate the deviation (difference) between each observation and the mean of the
data set. The deviations for Department A are the same as those calculated for
the mean absolute deviation.
Department A (Fabrication)
X       Mean    Deviation
.065    .050     .015
.035    .050    -.015
.042    .050    -.008
.058    .050     .008
.032    .050    -.018
.068    .050     .018
Step 3. Square each deviation.
Department A (Fabrication)
X       Mean    Deviation    Squared Deviation
.065    .050     .015        .000225
.035    .050    -.015        .000225
.042    .050    -.008        .000064
.058    .050     .008        .000064
.032    .050    -.018        .000324
.068    .050     .018        .000324
Step 4. Sum the squared deviations.
Department A (Fabrication)
X       Mean    Deviation    Squared Deviation
.065    .050     .015        .000225
.035    .050    -.015        .000225
.042    .050    -.008        .000064
.058    .050     .008        .000064
.032    .050    -.018        .000324
.068    .050     .018        .000324
Total                        .001226
Step 5. Divide the sum of the squared deviations by n-1.
Variance = .001226 / (6 – 1) = .000245
Calculate the variance for Department B:
Step 1. Calculate the arithmetic mean of the data set. We have also calculated the mean rate
for Department B of the scrap-rate example: .05.
Steps 2 – 4. Calculate the deviation between each observation and the mean of the data set;
square each deviation; and sum the squared deviations. The following
table demonstrates the three steps required to calculate the sum of the squared deviations for
Department B:
Department B (Assembly)
X       Mean    Deviation    Squared Deviation
.050    .050     .000        .000000
.048    .050    -.002        .000004
.052    .050     .002        .000004
.053    .050     .003        .000009
.048    .050    -.002        .000004
.049    .050    -.001        .000001
Total                        .000022
Step 5. Divide the sum of the squared deviations by n-1.
Variance = .000022 / (6 – 1) = .0000044, or approximately .000004
Compare variances for Department A and Department B:
The variance for Department A is .000245; the variance for Department B is .000004. Once
again, the variance comparison confirms that there is less dispersion in the observations from
Department B.
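The 5-step variance process can be sketched in Python as shown below. The function name sample_variance is an assumption introduced for illustration; the result is cross-checked against statistics.variance from the Python standard library, which also divides by n – 1.

```python
import statistics

dept_a = [0.065, 0.035, 0.042, 0.058, 0.032, 0.068]  # Fabrication
dept_b = [0.050, 0.048, 0.052, 0.053, 0.048, 0.049]  # Assembly


def sample_variance(values):
    """Follow the 5-step process: mean, deviations, squares, sum, divide by n - 1."""
    n = len(values)
    mean = sum(values) / n                          # Step 1
    deviations = [x - mean for x in values]         # Step 2
    squared = [d ** 2 for d in deviations]          # Step 3
    sum_of_squares = sum(squared)                   # Step 4
    return sum_of_squares / (n - 1)                 # Step 5


var_a = sample_variance(dept_a)
var_b = sample_variance(dept_b)

print(f"Variance, Department A: {var_a:.7f}")
print(f"Variance, Department B: {var_b:.7f}")

# Cross-check against the standard library's sample variance (n - 1 divisor).
print(f"statistics.variance, A: {statistics.variance(dept_a):.7f}")
print(f"statistics.variance, B: {statistics.variance(dept_b):.7f}")
```

Rounded to six decimal places, these values match the .000245 and .000004 figures compared above.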
Concerns About Using the Variance as a Measure of Dispersion. There are two concerns
commonly raised about using the variance as a measure of dispersion:
• As the deviations between the observations and the mean grow, the variance grows much
faster, because all the deviations are squared in the variance calculation.
• The variance is in a different unit of measure than the values of the data set. For example, if
the basic values are measured in feet, the variance is measured in square feet; if the basic
values are measured in terms of dollars, the variance is measured in terms of “square
dollars.”
Calculating the Standard Deviation. You can address these two common concerns by using
the standard deviation (S), which is the square root of the variance.
S = √Variance
For example: You can calculate the standard deviation for Departments A and B of the scrap-rate
example from the variances calculated above. This yields a standard deviation of .015652 for
Department A and .002000 for Department B:
Department A: S = √.000245 = .015652
Department B: S = √.000004 = .002000
Note: Both the variance and the standard deviation give increasing weight to observations that
are farther away from the mean. Because all deviations are squared, a single observation that is
far from the mean can substantially affect both the variance and the standard deviation.
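The standard deviation calculation can be sketched in Python as follows. The sketch assumes, based on the arithmetic, that the .015652 and .002000 figures are square roots of the variances after rounding to six decimal places; taking the square root of the unrounded variances (as statistics.stdev does) gives slightly different values. All variable names are illustrative assumptions.

```python
import math
import statistics

dept_a = [0.065, 0.035, 0.042, 0.058, 0.032, 0.068]  # Fabrication
dept_b = [0.050, 0.048, 0.052, 0.053, 0.048, 0.049]  # Assembly

# Sample variances (n - 1 divisor), then rounded to six decimals as in the text.
var_a = statistics.variance(dept_a)
var_b = statistics.variance(dept_b)

# Square root of the ROUNDED variances (assumed to be the method behind the text's figures).
s_a_rounded = math.sqrt(round(var_a, 6))
s_b_rounded = math.sqrt(round(var_b, 6))

# Square root of the UNROUNDED variances, for comparison.
s_a = statistics.stdev(dept_a)
s_b = statistics.stdev(dept_b)

print(f"Department A: {s_a_rounded:.6f} (rounded variance), {s_a:.6f} (unrounded)")
print(f"Department B: {s_b_rounded:.6f} (rounded variance), {s_b:.6f} (unrounded)")
```

Either way, the conclusion is unchanged: the standard deviation for Department B is far smaller than the standard deviation for Department A.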
Empirical Rule. The standard deviation has one characteristic that makes it extremely valuable
in statistical analysis. In a distribution of observations that is approximately normal
(symmetrical and bell shaped):
• The interval of the mean ± 1S includes approximately 68 percent of the total observations in
the population.
• The interval of the mean ± 2S includes approximately 95 percent of the total observations in
the population.
• The interval of the mean ± 3S includes approximately 99.7 percent of the total observations in
the population.
This relationship is a finding based on analysis of the normal distribution (bell-shaped
curve) that will be presented later in the chapter.
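The percentages in the empirical rule can be checked against the normal distribution itself. The sketch below uses statistics.NormalDist from the Python standard library (Python 3.8 or later); the function name share_within is an assumption introduced for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Proportion of a normal population within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s) of the mean: {share_within(k):.1%}")
```

These proportions hold for a truly normal population; for real data that are only approximately normal, the observed shares will only approximate these values.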
Coefficient of Variation. Thus far we have only compared two samples with equal means. In
that situation, the smaller the standard deviation, the smaller the relative dispersion in the sample
observations. However, that is not necessarily true when the means of two samples are not equal.
If the means are not equal, you need a measure of relative dispersion. The coefficient of variation
(CV) is such a measure. It expresses the standard deviation as a proportion of the mean:
CV = S / Mean
For example: Which of the following samples has more relative variation?
Calculate CV for Sample C:
Calculate CV for Sample D:
Compare the two CV values:
Even though the standard deviation for Sample D is twice as large as the standard deviation for
Sample C, the CV values demonstrate that Sample D exhibits less relative variation. This is true
because the mean for Sample D is so much larger than the mean for Sample C.
Note: We could calculate CV for the scrap-rate example, but such a calculation is not necessary
because the means of the two samples are equal.
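Although the calculation is not needed to compare the two departments, the scrap-rate data can still be used to illustrate how a CV is computed. The following Python sketch divides the sample standard deviation by the mean; the function name coefficient_of_variation is an assumption introduced here, and the standard deviations are taken from the unrounded variances.

```python
from statistics import mean, stdev

dept_a = [0.065, 0.035, 0.042, 0.058, 0.032, 0.068]  # Fabrication
dept_b = [0.050, 0.048, 0.052, 0.053, 0.048, 0.049]  # Assembly


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)


cv_a = coefficient_of_variation(dept_a)
cv_b = coefficient_of_variation(dept_b)

# Expressed as percentages of the mean.
print(f"CV, Department A: {cv_a:.1%}")
print(f"CV, Department B: {cv_b:.1%}")
```

Because both departments have the same mean, the CV values rank the departments in the same order as the standard deviations do.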