Variance and standard deviation

In a nutshell

The variance and the standard deviation are two measures that can be used to evaluate the spread of the data in a given set.
Variance

This measure considers how much each point $x$ deviates from the mean $\overline x$, using the squared deviation $(x-\overline x)^2$.
Because the deviations are squared, it is given in square units.
$$\sigma^2=\dfrac{\Sigma(x-\overline x)^2}{n}=\dfrac{\Sigma(x^2)}{n}-\left(\dfrac{\Sigma x}{n}\right)^2$$
| Symbol | Meaning |
| --- | --- |
| $\sigma^2$ | Variance of a data set |
| $\Sigma (x-\overline x)^2$ | Sum of the squares of each point's deviation from the mean |
| $n$ | Number of elements in the data set |
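The two forms of the variance formula are equal because $\overline x = \dfrac{\Sigma x}{n}$. A short derivation, expanding the square and using only the symbols defined above:

$$
\begin{aligned}
\dfrac{\Sigma (x-\overline x)^2}{n}
  &= \dfrac{\Sigma x^2 - 2\overline x\,\Sigma x + n\overline x^2}{n} \\
  &= \dfrac{\Sigma x^2}{n} - 2\overline x^2 + \overline x^2 \\
  &= \dfrac{\Sigma x^2}{n} - \left(\dfrac{\Sigma x}{n}\right)^2
\end{aligned}
$$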
Note: For raw data, it is easier to use $\dfrac{\Sigma x^2}{n}-\left(\dfrac{\Sigma x}{n}\right)^2$: "the mean of the squares minus the square of the mean".
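As a rough illustration of the two equivalent forms, the Python sketch below computes the variance of a raw data set both ways. The function names and the sample data are hypothetical and chosen only for this example.

```python
def variance_from_deviations(data):
    """Population variance: mean of the squared deviations from the mean."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / n


def variance_from_squares(data):
    """Population variance: mean of the squares minus the square of the mean."""
    n = len(data)
    return sum(x * x for x in data) / n - (sum(data) / n) ** 2


# Hypothetical raw data, used only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance_from_deviations(sample))  # 4.0
print(variance_from_squares(sample))     # 4.0
```

The second function needs only the running totals $\Sigma x$ and $\Sigma x^2$, which is why that form tends to be quicker by hand.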
Finding the variance in grouped data

To find the variance in a grouped data set, such as in a frequency table, you can use the following formulae:
$$\sigma^2=\dfrac{\Sigma f(x-\overline x)^2}{\Sigma f}=\dfrac{\Sigma fx^2}{\Sigma f}-\left(\dfrac{\Sigma fx}{\Sigma f}\right)^2$$
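The grouped formula could be sketched in Python as follows. The helper name `grouped_variance` and the small frequency table are assumptions made for illustration, not part of the original notes.

```python
def grouped_variance(values, frequencies):
    """Population variance of a frequency table.

    Uses sum(f * x^2) / sum(f) - (sum(f * x) / sum(f))^2.
    """
    total_f = sum(frequencies)
    sum_fx = sum(f * x for x, f in zip(values, frequencies))
    sum_fx2 = sum(f * x * x for x, f in zip(values, frequencies))
    return sum_fx2 / total_f - (sum_fx / total_f) ** 2


# Hypothetical table: the value 1 occurs twice, 2 four times, 3 twice.
print(grouped_variance([1, 2, 3], [2, 4, 2]))  # 0.5
```

Here $\Sigma f = 8$, $\Sigma fx = 16$ and $\Sigma fx^2 = 36$, so the result is $\dfrac{36}{8} - 2^2 = 0.5$.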
Standard deviation

This is the square root of the variance.
$$\sigma=\sqrt{\dfrac{\Sigma(x-\overline x)^2}{n}}=\sqrt{\dfrac{\Sigma(x^2)}{n}-\left(\dfrac{\Sigma x}{n}\right)^2}$$
| Symbol | Meaning |
| --- | --- |
| $\sigma$ | Standard deviation of the data set |
| $\Sigma (x-\overline x)^2$ | Sum of the squares of each point's deviation from the mean |
| $n$ | Number of elements in the data set |
Finding the standard deviation in grouped data

To find the standard deviation in a frequency table, use the following formulae:
$$\sigma=\sqrt{\dfrac{\Sigma f(x-\overline x)^2}{\Sigma f}}=\sqrt{\dfrac{\Sigma fx^2}{\Sigma f}-\left(\dfrac{\Sigma fx}{\Sigma f}\right)^2}$$
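One way to sanity-check the grouped standard deviation is to expand the frequency table back into raw data and compare against Python's standard-library `statistics.pstdev`, which computes the population standard deviation. This is a minimal sketch; `grouped_std`, `expand` and the table itself are hypothetical names and data introduced for this example.

```python
import math
import statistics


def grouped_std(values, frequencies):
    """Population standard deviation of a frequency table (deviation form)."""
    total_f = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / total_f
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies)) / total_f
    return math.sqrt(variance)


def expand(values, frequencies):
    """Repeat each value f times to recover the raw data behind the table."""
    return [x for x, f in zip(values, frequencies) for _ in range(f)]


# Hypothetical frequency table, used only for illustration.
values, frequencies = [1, 2, 3], [2, 4, 2]

print(grouped_std(values, frequencies))
print(statistics.pstdev(expand(values, frequencies)))
```

The two printed values should agree up to floating-point rounding, since both divide by the number of data points rather than by one less.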
Example 1

A company keeps track of the number of vacation days taken by its employees. The results are shown below.
| Number of vacation days | 18 | 19 | 20 | 21 | 22 | 23 | 24 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Number of employees | 1 | 4 | 5 | 8 | 10 | 3 | 1 |
Find the variance and the standard deviation.
Start by finding $\Sigma fx$ and $\Sigma fx^2$:
$$\Sigma fx=(18\times1)+(19\times4)+(20\times5)+(21\times8)+(22\times10)+(23\times3)+(24\times1)=675$$
$$\Sigma fx^2=(18^2\times1)+(19^2\times4)+(20^2\times5)+(21^2\times8)+(22^2\times10)+(23^2\times3)+(24^2\times1)=14299$$
The total number of employees is $\Sigma f=1+4+5+8+10+3+1=32$, so:
$$\sigma^2=\dfrac{\Sigma fx^2}{\Sigma f}-\left(\dfrac{\Sigma fx}{\Sigma f}\right)^2=\dfrac{14299}{32}-\left(\dfrac{675}{32}\right)^2=\underline{1.89746...}$$
$$\sigma=\sqrt{1.89746...}=\underline{1.37748...}$$
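The arithmetic in this example can be checked with a few lines of Python. The sketch below uses the table's values directly; the variable names are illustrative, and `fractions.Fraction` is used so the variance is exact before rounding.

```python
from fractions import Fraction
import math

days = [18, 19, 20, 21, 22, 23, 24]
employees = [1, 4, 5, 8, 10, 3, 1]

total_f = sum(employees)                                   # 32
sum_fx = sum(f * x for x, f in zip(days, employees))       # 675
sum_fx2 = sum(f * x * x for x, f in zip(days, employees))  # 14299

# Mean of the squares minus the square of the mean, kept as an exact fraction.
variance = Fraction(sum_fx2, total_f) - Fraction(sum_fx, total_f) ** 2
std_dev = math.sqrt(variance)

print(variance, float(variance))  # 1943/1024 1.8974609375
print(round(std_dev, 5))          # 1.37748
```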
Note: If the data had been grouped into class intervals, you could have estimated the variance and standard deviation by using the midpoint of each class interval as $x$.
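A minimal sketch of the midpoint approach is shown below. The class intervals and frequencies are hypothetical and not taken from the example above. The result is only an estimate, because each midpoint stands in for the unknown actual values in its class.

```python
import math

# Hypothetical class intervals (lower bound, upper bound) and their frequencies.
classes = [(0, 10), (10, 20), (20, 30)]
frequencies = [3, 5, 2]

# Use the midpoint of each class interval as x.
midpoints = [(low + high) / 2 for low, high in classes]  # 5.0, 15.0, 25.0

total_f = sum(frequencies)
mean = sum(f * m for m, f in zip(midpoints, frequencies)) / total_f
variance = sum(f * m * m for m, f in zip(midpoints, frequencies)) / total_f - mean ** 2
std_dev = math.sqrt(variance)

print(variance, std_dev)  # 49.0 7.0
```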