Box plots

Tutor: Bilal

In a nutshell

Box plots are diagrams which represent the spread and location of data. Box plots can be used to efficiently compare two or more data sets.  



Features of a box plot

Before drawing a box plot, certain features of a data set need to be defined and calculated. These include the quartiles, median, maximum, minimum and outliers. The table below defines each feature and, where applicable, gives the formula to calculate it for a discrete data set with $n$ entries.


| Feature | Definition | Formula |
|---|---|---|
| Lower quartile ($Q_1$) | The value under which $25\%$ of data points are found. | $\dfrac{n+1}{4}^{\text{th}}$ point |
| Median ($Q_2$) | The value under which $50\%$ of the data points are found. | $\dfrac{n+1}{2}^{\text{th}}$ point |
| Upper quartile ($Q_3$) | The value under which $75\%$ of the data points are found. | $\dfrac{3(n+1)}{4}^{\text{th}}$ point |
| Interquartile range ($IQR$) | The size of the range which contains the middle $50\%$ of the data points. | $Q_3 - Q_1$ |
| Outliers | Extreme values which lie outside the normal trend of a data set, as determined by the constant $k$. The constant $k$ will be given or implied in a question. | Greater than $Q_3 + k(IQR)$ or less than $Q_1 - k(IQR)$ |
| Maximum | Highest value of the data set which is not an outlier or at the boundary of an outlier. | |
| Minimum | Lowest value of the data set which is not an outlier or at the boundary of an outlier. | |



Note: When calculating quartiles or the median, if the calculated position ends in $.5$, the value of the given quartile is halfway between the data points above and below that position. If the calculated position is any other decimal number, round up to the next data point.
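The position formulas and the rounding rule in the note can be sketched in Python. This is a minimal illustration, not a standard library routine: the function names `value_at_position` and `quartiles` and the sample data are hypothetical, and the sketch assumes at least three data points so that every position falls inside the list.

```python
import math


def value_at_position(ordered, position):
    """Return the value at a 1-based position in an ordered list.

    Whole positions are read directly, positions ending in .5 give the
    midpoint of the two neighbouring values, and any other decimal is
    rounded up to the next data point.
    """
    below = math.floor(position)
    fraction = position - below
    if fraction == 0:
        return ordered[below - 1]
    if fraction == 0.5:
        return (ordered[below - 1] + ordered[below]) / 2
    return ordered[math.ceil(position) - 1]


def quartiles(data):
    """Return (Q1, Q2, Q3) using the (n+1)/4, (n+1)/2 and 3(n+1)/4 positions."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 3:
        raise ValueError("this sketch assumes at least three data points")
    q1 = value_at_position(ordered, (n + 1) / 4)
    q2 = value_at_position(ordered, (n + 1) / 2)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q2, q3


# Hypothetical data with n = 6: the positions are 1.75, 3.5 and 5.25.
print(quartiles([3, 8, 1, 6, 4, 9]))  # (3, 5.0, 9)
```

With six values, the lower quartile position 1.75 rounds up to the 2nd point, the median position 3.5 gives the midpoint of the 3rd and 4th points, and the upper quartile position 5.25 rounds up to the 6th point.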


Example 1

Olivia is collecting data about the ages of her friends' siblings. The answers she collects are $1, 1, 2, 2, 3, 4, 5, 5, 5, 5, 7, 7, 7, 7, 8$. Given that there are no outliers, find the maximum, minimum, quartiles and median of the ages collected.



Lower quartile

Position of $Q_1 = \dfrac{n+1}{4} = \dfrac{15+1}{4} = 4^{\text{th}}$ data point

$4^{\text{th}}$ value $= \underline{2}$

Upper quartile

Position of $Q_3 = \dfrac{3(n+1)}{4} = \dfrac{3(15+1)}{4} = 12^{\text{th}}$ data point

$12^{\text{th}}$ value $= \underline{7}$

Median

Position of $Q_2 = \dfrac{n+1}{2} = \dfrac{15+1}{2} = 8^{\text{th}}$ data point

$8^{\text{th}}$ value $= \underline{5}$

Interquartile range

$IQR = Q_3 - Q_1 = 7 - 2 = \underline{5}$

Maximum

$\underline{8}$

Minimum

$\underline{1}$



Drawing a box plot

Drawing a box plot involves using an appropriate scale to mark all the features of a data set as defined above.


Procedure

1. Draw an appropriate scale and label it.

2. Mark $Q_1$, $Q_2$ and $Q_3$ with vertical lines of an equal length. Use $Q_1$ and $Q_3$ to draw a box which is separated by $Q_2$.

3. Mark on the maximum and minimum with equally sized vertical lines and connect to the box with a line.

4. Use a cross ($\text{x}$) to denote any outliers beyond the minimum or maximum values.
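The four steps can be mimicked with a rough text rendering in Python. This is only a sketch of the procedure, not a substitute for a drawn diagram: the function name `text_box_plot`, the default scale from 0 to 10 and the choice of 4 characters per unit are all assumptions, and any outliers passed in must lie within the chosen scale.

```python
def text_box_plot(minimum, q1, median, q3, maximum, outliers=(), lo=0, hi=10, scale=4):
    """Return a two-line text box plot: the plot itself, then a labelled scale."""

    def col(value):
        # Column of a value on the scale (scale characters per unit).
        return round((value - lo) * scale)

    row = [" "] * (col(hi) + 1)
    # Step 3 (lines): whiskers run from the minimum to the maximum.
    for c in range(col(minimum), col(maximum) + 1):
        row[c] = "-"
    # Step 2 (box): the box runs from Q1 to Q3 and overwrites the whisker.
    for c in range(col(q1), col(q3) + 1):
        row[c] = "="
    # Steps 2 and 3 (marks): vertical lines at the five key values.
    for value in (minimum, q1, median, q3, maximum):
        row[col(value)] = "|"
    # Step 4: crosses for outliers beyond the minimum or maximum.
    for value in outliers:
        row[col(value)] = "x"
    # Step 1: a labelled scale, one label per unit (labels assumed shorter than scale).
    axis = "".join(str(v).ljust(scale) for v in range(lo, hi + 1)).rstrip()
    return "".join(row).rstrip() + "\n" + axis


# Values found for Olivia's data in Example 1: min 1, Q1 2, median 5, Q3 7, max 8.
print(text_box_plot(1, 2, 5, 7, 8))
```

The whiskers are drawn first so that the box and the vertical marks overwrite them, which mirrors how the box sits on top of the connecting line in a hand-drawn plot.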


Example 2

Draw a box plot to represent Olivia's data set.

[Figure: box plot of Olivia's data set]


Note: Sometimes outliers and quartiles will be given but not the maximum or minimum values of a data set. In this case, the effective maximum and minimum will respectively be the largest and smallest values which are not outliers.
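A short Python sketch of how the outlier boundaries lead to the effective maximum and minimum. It reuses Olivia's data and the quartiles found in Example 1. The value $k = 1.5$ is an assumption made purely for illustration, since the lesson says $k$ is given or implied in each question, and the function name `outlier_bounds` is hypothetical.

```python
def outlier_bounds(q1, q3, k):
    """Return (lower, upper) boundaries: Q1 - k(IQR) and Q3 + k(IQR)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


ages = [1, 1, 2, 2, 3, 4, 5, 5, 5, 5, 7, 7, 7, 7, 8]
q1, q3 = 2, 7  # quartiles from Example 1
k = 1.5        # assumed value of k, not stated in the lesson

lower, upper = outlier_bounds(q1, q3, k)

# Outliers are strictly less than the lower boundary or greater than the upper one.
outliers = [x for x in ages if x < lower or x > upper]
kept = [x for x in ages if lower <= x <= upper]

print(lower, upper)          # -5.5 14.5
print(outliers)              # []
print(min(kept), max(kept))  # 1 8
```

Under this assumed $k$, no age falls outside the boundaries, which agrees with the statement in Example 1 that there are no outliers, so the effective minimum and maximum are simply the smallest and largest ages.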



Comparing box plots

Box plots can be used to compare both the location and the spread of data sets. Use the context of the question when comparing them. Location is usually compared using the medians, and spread is usually compared using the interquartile range and the range.


Example 3

The two box plots below summarise the A-level marks in 2021 and 2022 in a given school. Without calculating individual values, compare the box plots and give your interpretation.

[Figure: box plots of the A-level marks in 2021 and 2022]


Compare the plots:

The median mark for 2022 is slightly lower than the median mark for 2021.

The interquartile range and range for 2022 are both smaller than those for 2021.


Interpret the data:

In 2021, a larger proportion of students did better in their A-levels than in 2022. However, students in 2022 performed much more similarly, with a tighter spread of results.
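The same comparison can be sketched in Python once the five key values of each box plot are known. The marks below are made-up numbers chosen only to match the pattern described above, because the values in the original figure are not available, and the function name `compare_box_plots` is hypothetical.

```python
def compare_box_plots(label_a, a, label_b, b):
    """Compare two box plot summaries by median (location), IQR and range (spread).

    Each summary is a dict with keys "min", "q1", "median", "q3" and "max".
    Returns a list of comparison sentences.
    """

    def measures(s):
        return {
            "median": s["median"],
            "interquartile range": s["q3"] - s["q1"],
            "range": s["max"] - s["min"],
        }

    stats_a, stats_b = measures(a), measures(b)
    sentences = []
    for name in ("median", "interquartile range", "range"):
        va, vb = stats_a[name], stats_b[name]
        if va == vb:
            sentences.append(f"The {name} is the same for {label_a} and {label_b}.")
        else:
            higher, lower = (label_a, label_b) if va > vb else (label_b, label_a)
            sentences.append(f"The {name} for {higher} is greater than for {lower}.")
    return sentences


# Hypothetical summaries, not read from the original figure.
marks_2021 = {"min": 30, "q1": 50, "median": 66, "q3": 80, "max": 98}
marks_2022 = {"min": 42, "q1": 55, "median": 63, "q3": 72, "max": 88}

for sentence in compare_box_plots("2021", marks_2021, "2022", marks_2022):
    print(sentence)
```

The sentences only state which value is larger. Interpreting them in the context of the question, as in the example above, is still left to the reader.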


Note: Sometimes the box plots will be on different scales. Make sure to compare them against the same scale.


