The normal distribution
In a nutshell
The normal distribution is a probability distribution used to model continuous random variables, and it appears in many real-life situations.
Definition
A continuous random variable is a random variable that can take any value within a range, so it has infinitely many possible values that cannot be listed one by one.
Shape of the normal distribution
The graph of a normal distribution curve is shown below.
Due to its shape, it is also known as a bell curve.
Parameters of the normal distribution
The normal distribution depends on two parameters: the mean (μ) and the variance (σ²) of the data set.
- The mean affects the horizontal displacement of the graph.
- The variance affects the "steepness" of the graph: a low variance gives a tall, narrow curve, and a high variance gives a low, wide curve.
[Figure: two normal distribution curves, the first with a low value of σ² and the second with a high value of σ²]
This makes sense; variance represents the spread of data. A higher variance should mean that the data is more spread out, as the second graph above shows.
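The effect of the two parameters can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the particular means and standard deviations are arbitrary values chosen for illustration. Note that `NormalDist` takes the standard deviation σ, not the variance σ².

```python
from statistics import NormalDist

# Same mean, different variances (illustrative values only).
narrow = NormalDist(mu=0, sigma=1)  # variance 1
wide = NormalDist(mu=0, sigma=3)    # variance 9

# The height of the curve at the mean falls as the variance grows,
# because the same total area is spread over a wider range.
peak_narrow = narrow.pdf(0)
peak_wide = wide.pdf(0)

# Changing the mean moves the curve sideways without changing its shape.
shifted = NormalDist(mu=5, sigma=1)
peak_shifted = shifted.pdf(5)

print(peak_narrow, peak_wide, peak_shifted)
```

The first and third peaks are equal, while the second is lower, which matches the two graphs above.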
Finding probabilities from a normal distribution
If a continuous random variable X follows a normal distribution with mean μ and variance σ², then we write:
X∼N(μ,σ²)
The probability that X lies within an interval is given by the area under the corresponding curve within the interval.
[Figure: normal distribution curve with the area A under the curve between a and b, where A=P(a≤X≤b)]
Note that because a continuous random variable has infinitely many possible values, the probability that it takes any one exact value is zero: P(X=a)=0. Therefore, the normal distribution is only used to find the probability that X lies in an interval, such as P(X≤b) or P(a≤X≤b).
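An interval probability can be computed as a difference of two cumulative probabilities, P(a≤X≤b)=P(X≤b)−P(X≤a). The sketch below assumes Python's standard `statistics.NormalDist`; the helper name `interval_probability` and the numbers passed to it are hypothetical and only for illustration.

```python
from statistics import NormalDist

def interval_probability(mean, variance, a, b):
    """Return P(a <= X <= b) for X ~ N(mean, variance)."""
    # NormalDist expects the standard deviation, so take the square root.
    dist = NormalDist(mu=mean, sigma=variance ** 0.5)
    return dist.cdf(b) - dist.cdf(a)

# Area under the curve between -1 and 1 for mean 0 and variance 1.
p = interval_probability(mean=0, variance=1, a=-1, b=1)

# An "interval" of zero width has zero area, so a single exact value
# has probability zero.
p_point = interval_probability(mean=0, variance=1, a=0.5, b=0.5)

print(p, p_point)
```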
Properties of the normal distribution
Here are some properties that are worth learning about the normal distribution.
- The normal distribution curve is symmetrical about the mean.
- The mean, median and mode of the normal distribution are all the same.
- Approximately 68% of the data in a normal distribution lies within one standard deviation of the mean.
- Approximately 95% of the data in a normal distribution lies within two standard deviations of the mean.
- Approximately 99.7% of the data in a normal distribution lies within three standard deviations of the mean.
- The total area under a normal distribution curve is equal to 1.
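The properties above can be checked numerically. This is a small sketch using `statistics.NormalDist` from the Python standard library; the mean and standard deviation are arbitrary illustrative values, and the same percentages come out for any other choice.

```python
from statistics import NormalDist

dist = NormalDist(mu=10, sigma=2)  # illustrative values only

def within(k):
    """Probability of lying within k standard deviations of the mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)

coverage = {k: within(k) for k in (1, 2, 3)}
for k, prob in coverage.items():
    print(k, round(prob, 4))  # about 0.6827, 0.9545 and 0.9973

# Symmetry about the mean: half of the area lies below the mean,
# and the two tails at equal distances from the mean have equal area.
below_mean = dist.cdf(dist.mean)
left_tail = dist.cdf(dist.mean - 3)
right_tail = 1 - dist.cdf(dist.mean + 3)
print(below_mean, left_tail, right_tail)
```

The more precise values show that 68%, 95% and 99.7% are rounded figures.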
Example 1
The random variable X follows a normal distribution with mean 100 and variance 16. Find the following probabilities.
i) P(X≤100)
ii) P(92≤X≤108)
iii) P(100≤X<104)
Part i):
Use the fact that the normal distribution is symmetrical about the mean:
P(X≤100)=0.5
Part ii):
Use the fact that 95% of data lies within two standard deviations of the mean:
σ²=16 → σ=4
P(92≤X≤108)=P(100−2(4)≤X≤100+2(4))=P(μ−2σ≤X≤μ+2σ)≈0.95
P(92≤X≤108)≈0.95
Part iii):
Use the fact that 68% of the data lies within one standard deviation of the mean combined with the fact that the normal distribution is symmetrical about the mean:
Since P(X=104)=0 for a continuous random variable, P(100≤X<104)=P(100≤X≤104).
P(100≤X≤104)=P(μ≤X≤μ+σ)=½×P(μ−σ≤X≤μ+σ)≈½×0.68=0.34
P(100≤X<104)≈0.34
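The three answers can be cross-checked with a short calculation. This sketch assumes Python's standard `statistics.NormalDist` and is only a numerical check, not part of the method used above.

```python
from statistics import NormalDist

# X ~ N(100, 16), so the standard deviation is sqrt(16) = 4.
X = NormalDist(mu=100, sigma=16 ** 0.5)

p_i = X.cdf(100)                # P(X <= 100)
p_ii = X.cdf(108) - X.cdf(92)   # P(92 <= X <= 108)
p_iii = X.cdf(104) - X.cdf(100) # P(100 <= X < 104)

print(round(p_i, 4), round(p_ii, 4), round(p_iii, 4))
# prints: 0.5 0.9545 0.3413
```

The results agree with 0.5, 0.95 and 0.34 to two decimal places; the small differences arise because 68% and 95% are rounded values.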