When all values in a data set are the same, the sample variance is equal to zero.
To understand why this is the case, let's first define what variance is. Variance measures the spread of data points in a data set. It quantifies how much the values deviate from the mean (average) of the data set. The formula for calculating the sample variance ($s^2$) is as follows:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
Where:
- $s^2$ = sample variance
- $x_i$ = each individual observation in the data set
- $\bar{x}$ = sample mean (average of the observations)
- n = number of observations in the sample
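As a rough sketch, the formula above could be written in Python as follows. The function name sample_variance is a hypothetical helper chosen for illustration, and the check for at least two observations is an assumption added here so that the divisor n - 1 is never zero.

```python
def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        # With a single observation, n - 1 = 0 and the formula is undefined.
        raise ValueError("sample variance needs at least two observations")
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)


# Mean is 3, squared deviations are 4, 1, 0, 1, 4, and 10 / 4 = 2.5.
print(sample_variance([1, 2, 3, 4, 5]))  # 2.5
```

Each line of the function corresponds to one symbol in the formula: mean is the sample mean, squared_deviations holds the squared differences, and the final division applies the n - 1 divisor.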
Now, when all the values in the data set are the same, every value equals the mean, so there is no deviation from it. Suppose we have four identical values (n = 4), say 5, 5, 5, 5:
Mean (x̄) = (5 + 5 + 5 + 5) / 4 = 5
The deviation of each value from the mean is:
(5 - 5) = 0
Thus, the sum of squares of the deviations is:
$\sum (x_i - \bar{x})^2 = 0^2 + 0^2 + 0^2 + 0^2 = 0$
Since the sum of squares of the deviations is 0, we substitute this back into the variance formula:
$s^2 = 0 / (n - 1) = 0 / (4 - 1) = 0$
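The worked example can also be checked numerically. The sketch below uses statistics.variance from Python's standard library, which computes the sample variance with the n - 1 divisor. The variable names and the extra constant data sets are illustrative assumptions, not part of the example above.

```python
import statistics

# The data set from the worked example: four identical values.
data = [5, 5, 5, 5]
result = statistics.variance(data)
print(result == 0)  # True

# The same holds for other constants and sample sizes (at least two values each).
constant_sets = [[c] * n for c in (5, -3, 0.25) for n in (2, 4, 10)]
all_zero = all(statistics.variance(s) == 0 for s in constant_sets)
print(all_zero)  # True
```

Every constant data set gives a variance of exactly zero, because each value equals the mean and contributes nothing to the sum of squared deviations.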
Therefore, when all values are the same, the variance, which describes how much the data points spread out from the average, is 0 because there is no spread at all. The same argument holds for any n ≥ 2 identical values c: the mean is c, every deviation c - c is 0, and so the sum of squared deviations is 0. A variance of zero therefore means the data set has no variability.