Measures of Dispersion. As described in the previous section, the mean is the measure of

central tendency most commonly used in contract pricing. Though the mean for a data set is a

value around which the other values tend to cluster, it conveys no indication of the closeness of

the clustering (that is, the dispersion). All observations could be close to the mean or far away.

If you want an indication of how closely these other values are clustered around the mean, you

must look beyond measures of central tendency to measures of dispersion. This section will

examine:

• Several measures of absolute dispersion commonly used to describe the variation within a data set: the range, mean absolute deviation, variance, and standard deviation.

• One measure of relative dispersion: the coefficient of variation.

Assume that you have the following scrap rate data for two contractor departments:

Contractor Scrap Rate Data

Month       Dept. A, Fabrication    Dept. B, Assembly
February    .065                    .050
March       .035                    .048
April       .042                    .052
May         .058                    .053
June        .032                    .048
July        .068                    .049
Total       .300                    .300
Mean        .050                    .050

The mean scrap rate for both departments is the same — 5 percent. However, the monthly scrap

rates in Department B show less variation (dispersion) around the mean. As a result, you would

probably feel more comfortable forecasting a scrap rate of 5 percent for Department B than you

would for Department A.

Differences in dispersion will not always be so obvious. The remainder of this section will

demonstrate how you can quantify dispersion using the five measures identified above.

Calculating the Range. Probably the quickest and easiest measure of dispersion to calculate is

the range. The range of a set of data is the difference between the highest and lowest observed

values. The higher the range, the greater the amount of variation in a data set.

R = H – L

Where:

R = Range

H = Highest observed value in the data set

L = Lowest observed value in the data set

Calculating the Range for the Scrap-Rate Example. The range for Department A is .068 – .032 = .036; the range for Department B is .053 – .048 = .005. By comparing the range for Department A scrap-rate data with the range for Department B, you can easily determine that the historical data from Department A shows greater dispersion.
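The range comparison can also be sketched in a few lines of Python. This is only an illustration: the list names and the helper `data_range` are assumptions introduced here, and the values are the six monthly scrap rates from the table above.

```python
# Monthly scrap rates from the Contractor Scrap Rate Data table (February to July).
dept_a = [0.065, 0.035, 0.042, 0.058, 0.032, 0.068]  # Dept. A, Fabrication
dept_b = [0.050, 0.048, 0.052, 0.053, 0.048, 0.049]  # Dept. B, Assembly


def data_range(values):
    """Return R = H - L: the highest observed value minus the lowest."""
    return max(values) - min(values)


range_a = data_range(dept_a)
range_b = data_range(dept_b)

# Format to three decimals to match the precision of the source data.
print(f"Department A range: {range_a:.3f}")
print(f"Department B range: {range_b:.3f}")
```

Formatted to three decimal places, the two ranges print as 0.036 and 0.005, so Department A's range is roughly seven times that of Department B.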

Mean Absolute Deviation. The mean absolute deviation (MAD) is the average absolute

difference between the observed values and the arithmetic mean (average) for all values in the

data set.

If you subtracted the mean from each observation, some answers would be positive and some

negative. The sum of all the deviations (differences) will always be zero. That tells you nothing

about how far the average observation is from the mean. To make that computation, you can use

the absolute difference between each observation and the mean. An absolute difference is the difference without regard to sign, so every absolute value is written as a positive number. If the mean is eight and the observed value is six, the calculated difference is negative two (6 – 8 = –2), but the absolute difference is two (without the negative sign).

Note: Absolute values are identified using a vertical line before and after the value (e.g., |X| identifies the absolute value of X).

To compute the MAD, use the following 5-step process:

Step 1. Calculate the arithmetic mean of the data set.

Step 2. Calculate the deviation (difference) between each observation and the mean of the data set.

Step 3. Convert each deviation to its absolute value (i.e., its value without considering the sign of the deviation).

Step 4. Sum the absolute deviations.

Step 5. Divide the total absolute deviation by the number of observations in the data set.
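The five steps map directly onto a short function. The sketch below is one possible rendering, not a prescribed implementation; the function name `mean_absolute_deviation` and the list names are assumptions, and the data are the scrap rates from the table earlier in this section.

```python
def mean_absolute_deviation(values):
    """Compute the MAD by following the five-step process."""
    n = len(values)
    mean = sum(values) / n                       # Step 1: arithmetic mean
    deviations = [x - mean for x in values]      # Step 2: deviation from the mean
    absolute = [abs(d) for d in deviations]      # Step 3: drop the sign
    total = sum(absolute)                        # Step 4: total absolute deviation
    return total / n                             # Step 5: divide by the number of observations


dept_a = [0.065, 0.035, 0.042, 0.058, 0.032, 0.068]  # Dept. A, Fabrication
dept_b = [0.050, 0.048, 0.052, 0.053, 0.048, 0.049]  # Dept. B, Assembly

mad_a = mean_absolute_deviation(dept_a)
mad_b = mean_absolute_deviation(dept_b)

print(f"MAD, Department A: {mad_a:.3f}")
print(f"MAD, Department B: {mad_b:.3f}")
```

Keeping each step on its own line makes the intermediate lists easy to inspect if a hand calculation and the function disagree.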

Calculating the Mean Absolute Deviation for the Scrap-Rate Example. We can use the 5-step process described above to calculate the MAD values for Departments A and B of the scrap-rate example.

Calculate the MAD for Department A:

Step 1. Calculate the arithmetic mean of the data set. We have already calculated the mean

rate for Department A of the scrap-rate example — .05.

Step 2. Calculate the deviation (difference) between each observation and the mean of the data set.

Department A (Fabrication)

X       Mean    X – Mean
.065    .050     .015
.035    .050    -.015
.042    .050    -.008
.058    .050     .008
.032    .050    -.018
.068    .050     .018

Step 3. Convert each deviation to its absolute value.

Department A (Fabrication)

X       Mean    X – Mean    |X – Mean|
.065    .050     .015       .015
.035    .050    -.015       .015
.042    .050    -.008       .008
.058    .050     .008       .008
.032    .050    -.018       .018
.068    .050     .018       .018

Step 4. Sum the absolute deviations.

Department A (Fabrication)

X       Mean    X – Mean    |X – Mean|
.065    .050     .015       .015
.035    .050    -.015       .015
.042    .050    -.008       .008
.058    .050     .008       .008
.032    .050    -.018       .018
.068    .050     .018       .018
Total                       .082

Step 5. Divide the total absolute deviation by the number of observations in the data set.

MAD = .082 ÷ 6 = .014 (rounded to three decimal places)

Calculate the MAD for Department B:

Step 1. Calculate the arithmetic mean of the data set. We have also calculated the mean rate

for Department B of the scrap-rate example — .05.

Steps 2 – 4. Calculate the deviation between each observation and the mean of the data set;

convert the deviation to its absolute value; and sum the absolute deviations. The following

table demonstrates the three steps required to calculate the total absolute deviation for

Department B:

Department B (Assembly)

X       Mean    X – Mean    |X – Mean|
.050    .050     .000       .000
.048    .050    -.002       .002
.052    .050     .002       .002
.053    .050     .003       .003
.048    .050    -.002       .002
.049    .050    -.001       .001
Total                       .010

Step 5. Divide the total absolute deviation by the number of observations in the data set.

MAD = .010 ÷ 6 = .002 (rounded to three decimal places)

Compare MAD values for Department A and Department B:

The MAD for Department A is .014; the MAD for Department B is .002. Note that the MAD for

Department B is much smaller than the MAD for Department A. This comparison once again

confirms that there is less dispersion in the observations from Department B.

Calculating the Variance. Variance is one of the two most popular measures of dispersion (the other is the standard deviation, which is described below). The variance of a sample is the average of the squared deviations between each observation and the mean.

However, statisticians have determined that when you have a relatively small sample, you can get a better estimate of the true population variance if you divide the sum of the squared deviations by n – 1 instead of n:

Sample variance = (sum of the squared deviations) ÷ (n – 1)

The term, n – 1, is known as the number of degrees of freedom that can be used to estimate

population variance.

This adjustment is necessary, because samples are usually more alike than the populations from

which they are taken. Without this adjustment, the sample variance is likely to underestimate the

true variation in the population. Division by n – 1 in a sense artificially inflates the sample

variance but in so doing, it makes the sample variance a better estimator of the population

variance. As the sample size increases, the relative effect of this adjustment decreases (e.g., dividing by four rather than five will have a greater effect on the quotient than dividing by 29 instead of 30).
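A brief sketch can make the size of the adjustment concrete. It uses two functions from Python's standard `statistics` module: `pvariance` divides by n, while `variance` divides by n – 1. The helper `inflation_factor` and the variable names are assumptions introduced for this illustration.

```python
import statistics

# Department A scrap rates from the table earlier in this section.
dept_a = [0.065, 0.035, 0.042, 0.058, 0.032, 0.068]

var_n = statistics.pvariance(dept_a)          # divides by n
var_n_minus_1 = statistics.variance(dept_a)   # divides by n - 1


def inflation_factor(n):
    """Ratio of the n - 1 result to the n result, which equals n / (n - 1)."""
    return n / (n - 1)


print(f"Divide by n:     {var_n:.7f}")
print(f"Divide by n - 1: {var_n_minus_1:.7f}")

# The adjustment matters less as the sample grows.
for n in (5, 30):
    print(f"n = {n}: factor = {inflation_factor(n):.4f}")
```

With five observations the adjustment raises the result by 25 percent; with thirty it raises it by only about 3.4 percent.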

To compute the variance, use this 5-step process:

Step 1. Calculate the arithmetic mean of the data set.

Step 2. Calculate the deviation (difference) between each observation and the mean of the data

set.

Step 3. Square each deviation.

Step 4. Sum the squared deviations.

Step 5. Divide the sum of the squared deviations by n – 1.
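As with the MAD, the five steps for the variance can be written as a short function. This is a sketch under the same assumptions as before: `sample_variance` and the list names are illustrative, and the data are the scrap rates from the table earlier in this section.

```python
def sample_variance(values):
    """Compute the sample variance by following the five-step process."""
    n = len(values)
    mean = sum(values) / n                       # Step 1: arithmetic mean
    deviations = [x - mean for x in values]      # Step 2: deviation from the mean
    squared = [d ** 2 for d in deviations]       # Step 3: square each deviation
    total = sum(squared)                         # Step 4: sum of squared deviations
    return total / (n - 1)                       # Step 5: divide by n - 1


dept_a = [0.065, 0.035, 0.042, 0.058, 0.032, 0.068]  # Dept. A, Fabrication
dept_b = [0.050, 0.048, 0.052, 0.053, 0.048, 0.049]  # Dept. B, Assembly

var_a = sample_variance(dept_a)
var_b = sample_variance(dept_b)

print(f"Variance, Department A: {var_a:.6f}")
print(f"Variance, Department B: {var_b:.6f}")
```

The only differences from the MAD function are Step 3 (squaring instead of taking the absolute value) and the divisor in Step 5.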

Calculate the variance for Department A:

Step 1. Calculate the arithmetic mean of the data set. We have already calculated the mean

rate for Department A of the scrap-rate example — .05.

Step 2. Calculate the deviation (difference) between each observation and the mean of the data set. The deviations for Department A are the same as those calculated for the mean absolute deviation.

Department A (Fabrication)

X       Mean    X – Mean
.065    .050     .015
.035    .050    -.015
.042    .050    -.008
.058    .050     .008
.032    .050    -.018
.068    .050     .018

Step 3. Square each deviation.

Department A (Fabrication)

X       Mean    X – Mean    (X – Mean)²
.065    .050     .015       .000225
.035    .050    -.015       .000225
.042    .050    -.008       .000064
.058    .050     .008       .000064
.032    .050    -.018       .000324
.068    .050     .018       .000324

Step 4. Sum the squared deviations.

Department A (Fabrication)

X       Mean    X – Mean    (X – Mean)²
.065    .050     .015       .000225
.035    .050    -.015       .000225
.042    .050    -.008       .000064
.058    .050     .008       .000064
.032    .050    -.018       .000324
.068    .050     .018       .000324
Total                       .001226

Step 5. Divide the sum of the squared deviations by n – 1.

Variance = .001226 ÷ (6 – 1) = .000245 (rounded to six decimal places)

Calculate the variance for Department B:

Step 1. Calculate the arithmetic mean of the data set. We have also calculated the mean rate

for Department B of the scrap-rate example — .05.

Steps 2 – 4. Calculate the deviation between each observation and the mean of the data set; square each deviation; and sum the squared deviations. The following table demonstrates the three steps required to calculate the sum of the squared deviations for Department B:

Department B (Assembly)

X       Mean    X – Mean    (X – Mean)²
.050    .050     .000       .000000
.048    .050    -.002       .000004
.052    .050     .002       .000004
.053    .050     .003       .000009
.048    .050    -.002       .000004
.049    .050    -.001       .000001
Total                       .000022

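Step 5. Divide the sum of the squared deviations by n – 1. The squared deviations for Department B sum to .000022.

Variance = .000022 ÷ (6 – 1) = .000004 (rounded to six decimal places)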

Compare variances for Department A and Department B:

The variance for Department A is .000245; the variance for Department B is .000004. Once

again, the variance comparison confirms that there is less dispersion in the observations from

Department B.

Concerns About Using the Variance as a Measure of Dispersion. There are two concerns

commonly raised about using the variance as a measure of dispersion:

• As the deviations between the observations and the mean grow, the variance grows much faster, because all the deviations are squared in the variance calculation.

• The variance is in a different denomination than the values of the data set. For example, if

the basic values are measured in feet, the variance is measured in square feet; if the basic

values are measured in terms of dollars, the variance is measured in terms of “square

dollars.”

Calculating the Standard Deviation. You can address these two common concerns by using the standard deviation (S), the square root of the variance.

Standard deviation (S) = square root of the variance

For example: Taking the square root of the rounded variances for Departments A and B of the scrap-rate example (.000245 and .000004) yields a standard deviation of .015652 for Department A and .002000 for Department B.
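The calculation can be checked with Python's standard library, as a sketch. `statistics.stdev` returns the sample standard deviation (the square root of the n – 1 variance). The variable names are assumptions. Note that the published figures correspond to square roots of the variances after rounding to six decimal places; working from unrounded variances gives slightly larger values.

```python
import math
import statistics

dept_a = [0.065, 0.035, 0.042, 0.058, 0.032, 0.068]  # Dept. A, Fabrication
dept_b = [0.050, 0.048, 0.052, 0.053, 0.048, 0.049]  # Dept. B, Assembly

# Sample standard deviation from the unrounded data.
sd_a = statistics.stdev(dept_a)
sd_b = statistics.stdev(dept_b)

# Square roots of the variances after rounding to six decimals, as in the text.
sd_a_from_rounded = math.sqrt(0.000245)
sd_b_from_rounded = math.sqrt(0.000004)

print(f"Department A: {sd_a:.6f} (unrounded), {sd_a_from_rounded:.6f} (from rounded variance)")
print(f"Department B: {sd_b:.6f} (unrounded), {sd_b_from_rounded:.6f} (from rounded variance)")
```

Either way the conclusion is unchanged: Department A's standard deviation is several times that of Department B.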

Note: Both the variance and the standard deviation give increasing weight to observations that are farther away from the mean. Because all deviations are squared, a single observation that is far from the mean can substantially affect both the variance and the standard deviation.
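To illustrate the note, the sketch below uses a small hypothetical data set (not taken from the scrap-rate example) in which one observation lies far from the others. It compares how much of the total that single observation accounts for when deviations are taken as absolute values versus when they are squared. All names and values here are assumptions for illustration.

```python
# Hypothetical observations; 70 lies far from the rest.
values = [48, 49, 50, 50, 51, 70]

mean = sum(values) / len(values)

abs_devs = [abs(x - mean) for x in values]     # used by the MAD
sq_devs = [(x - mean) ** 2 for x in values]    # used by the variance

# Share of each total contributed by the single most distant observation.
share_abs = max(abs_devs) / sum(abs_devs)
share_sq = max(sq_devs) / sum(sq_devs)

print(f"Share of total absolute deviation: {share_abs:.0%}")
print(f"Share of total squared deviation:  {share_sq:.0%}")
```

In this hypothetical set the distant observation supplies half of the total absolute deviation but more than four-fifths of the total squared deviation, which is the extra weight the note describes.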

Empirical Rule. The standard deviation (S) has one characteristic that makes it extremely valuable in statistical analysis. In a distribution of observations that is approximately normal (symmetrical and bell shaped):

• The interval of the mean ± 1S includes approximately 68 percent of the total observations in the population.

• The interval of the mean ± 2S includes approximately 95 percent of the total observations in the population.

• The interval of the mean ± 3S includes approximately 99.7 percent of the total observations in the population.

This relationship is actually a finding based on analysis of the normal distribution (bell shaped

curve) that will be presented later in the chapter.
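The three percentages can be reproduced from the normal distribution itself. The sketch below uses `NormalDist` from Python's standard `statistics` module (available in Python 3.8 and later); the helper name `share_within` is an assumption introduced here.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Share of a normal population lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"Mean ± {k}S: {share_within(k):.1%}")
```

Because the shares depend only on the number of standard deviations from the mean, the same percentages apply to any normal distribution, whatever its mean and standard deviation.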

Coefficient of Variation. Thus far we have only compared two samples with equal means. In

that situation, the smaller the standard deviation, the smaller the relative dispersion in the sample

observations. However, that is not necessarily true when the means of two samples are not equal.

If the means are not equal, you need a measure of relative dispersion. The coefficient of variation

(CV) is such a measure.

For example: Which of the following samples has more relative variation?

Calculate CV for Sample C:

Calculate CV for Sample D:

Compare the two CV values:

Even though the standard deviation for Sample D is twice as large as the standard deviation for

Sample C, the CV values demonstrate that Sample D exhibits less relative variation. This is true

because the mean for Sample D is so much larger than the mean for Sample C.
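The same pattern can be sketched with made-up numbers, assuming the usual definition of the coefficient of variation as the standard deviation divided by the mean. The two samples below are hypothetical stand-ins chosen for illustration; they are not the data for Sample C and Sample D, and the function and variable names are likewise assumptions.

```python
import statistics


def coefficient_of_variation(values):
    """CV = sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical samples, not the Sample C and Sample D data.
sample_1 = [8, 10, 12]       # mean 10, standard deviation 2
sample_2 = [96, 100, 104]    # mean 100, standard deviation 4

cv_1 = coefficient_of_variation(sample_1)
cv_2 = coefficient_of_variation(sample_2)

print(f"Sample 1: S = {statistics.stdev(sample_1):.1f}, CV = {cv_1:.2f}")
print(f"Sample 2: S = {statistics.stdev(sample_2):.1f}, CV = {cv_2:.2f}")
```

Here the second sample has twice the standard deviation of the first, yet its CV is smaller because its mean is ten times larger. The CV is often multiplied by 100 and reported as a percentage.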

Note: We could calculate CV for the scrap-rate example, but such a calculation is not necessary

because the means of the two samples are equal.