The sample standard deviation is a descriptive statistic that measures the spread of a quantitative data set. This number can be any non-negative real number. Since zero is a non-negative real number, it seems worthwhile to ask, “When will the sample standard deviation be equal to zero?” This occurs in the very special and highly unusual case when all of our data values are exactly the same. We will explore the reasons why.
Description of the Standard Deviation
Two important questions that we typically want to answer about a data set include:
- What is the center of the dataset?
- How spread out is the set of data?
There are different measurements, called descriptive statistics, that answer these questions. For example, the center of the data, also known as the average, can be described in terms of the mean, median or mode. Other, less well-known statistics, such as the midhinge or the trimean, can also be used.
For the spread of our data, we could use the range, the interquartile range or the standard deviation. The standard deviation is paired with the mean to quantify the spread of our data. We can then use this number to compare multiple data sets. The greater the standard deviation, the greater the spread.
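As a small illustration of comparing spreads, the sketch below uses Python's standard statistics module on two made-up data sets (the names tight and wide and their values are assumptions chosen for this example). Both have the same mean, but their sample standard deviations differ.

```python
from statistics import mean, stdev

# Two hypothetical data sets with the same center but different spread.
tight = [4, 5, 6]
wide = [1, 5, 9]

# stdev() is the sample standard deviation (it divides by n - 1).
print(mean(tight), stdev(tight))  # 5 1.0
print(mean(wide), stdev(wide))    # 5 4.0
```

The second data set has the larger standard deviation, which matches the fact that its values lie farther from the shared mean of 5.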
So let's consider from this description what it would mean to have a standard deviation of zero. This would indicate that there is no spread at all in our data set. All of the individual data values would be clumped together at a single value. Since there would only be one value that our data could have, this value would constitute the mean of our sample.
In this situation, when all of our data values are the same, there would be no variation whatsoever. Intuitively it makes sense that the standard deviation of such a data set would be zero.
The sample standard deviation is defined by a formula. So any statement such as the one above should be proved by using this formula. We begin with a data set that fits the description above: all values are identical, and there are n values equal to x.
We calculate the mean of this data set and see that it is
x̄ = (x + x + … + x)/n = nx/n = x.
Now when we calculate the individual deviations from the mean, we see that all of these deviations are zero. Consequently, the variance and the standard deviation are both equal to zero.
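The calculation above can be sketched in a few lines of Python. The function names sample_variance and sample_std are hypothetical helpers written for this illustration, and the data set of five sevens is made up. Exact fractions are used as an assumption of this sketch so that the arithmetic mirrors the proof with real numbers and is not affected by floating-point rounding.

```python
from fractions import Fraction
from math import sqrt

def sample_variance(data):
    """Sample variance with the n - 1 denominator, computed exactly."""
    values = [Fraction(v) for v in data]
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data values")
    mean = sum(values) / n
    # Sum of squared deviations from the mean, divided by n - 1.
    return sum((v - mean) ** 2 for v in values) / (n - 1)

def sample_std(data):
    """Sample standard deviation: the square root of the sample variance."""
    return sqrt(sample_variance(data))

constant_data = [7, 7, 7, 7, 7]          # n = 5 values, all equal to x = 7
print(sample_variance(constant_data))    # 0
print(sample_std(constant_data))         # 0.0
```

Every deviation from the mean is zero here, so the sum of squared deviations, the variance and the standard deviation are all zero.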
Necessary and Sufficient
We see that if the data set displays no variation, then its standard deviation is zero. We may ask if the converse of this statement is also true. To see if it is, we will use the formula for standard deviation again. This time, however, we will set the standard deviation equal to zero. We will make no assumptions about our data set, but will see what setting s = 0 implies.
Suppose that the standard deviation of a data set is equal to zero. This would imply that the sample variance s² is also equal to zero. The result is the equation:
0 = (1/(n - 1)) ∑ (xᵢ - x̄)²
We multiply both sides of the equation by n - 1 and see that the sum of the squared deviations is equal to zero. Since we are working with real numbers, each squared deviation is non-negative, so the only way for this to occur is for every one of the squared deviations to be equal to zero. This means that for every i, the term (xᵢ - x̄)² = 0.
We now take the square root of both sides of the above equation and see that every deviation from the mean must be equal to zero. That is, for all i,
xᵢ - x̄ = 0.
This means that every data value is equal to the mean. This result along with the one above allows us to say that the sample standard deviation of a data set is zero if and only if all of its values are identical.
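The "if and only if" statement can also be spot-checked numerically. The sketch below is not a proof, only a check on a handful of made-up data sets. The helpers has_zero_std and all_identical are hypothetical names introduced for this illustration, and the values are converted to exact fractions (an assumption of this sketch) so that rounding cannot produce a tiny nonzero variance for identical values.

```python
from fractions import Fraction
from statistics import variance

def has_zero_std(data):
    # s = 0 exactly when s^2 = 0, so it is enough to test the sample variance.
    return variance([Fraction(v) for v in data]) == 0

def all_identical(data):
    # True when every value equals the first one.
    return all(v == data[0] for v in data)

# Hypothetical examples: some constant, some with variation.
examples = [[5, 5, 5], [2.5, 2.5], [0.1] * 10, [5, 5, 6], [1, 2, 3, 4]]
for data in examples:
    # Both conditions hold together or fail together.
    assert has_zero_std(data) == all_identical(data)
```

In each example the two conditions agree: the sample variance is zero for the constant data sets and positive as soon as a single value differs.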