# Scatter graphs

## In a nutshell

A scatter graph is a graph displaying plotted data points. The graph tells you how closely variables are related to each other as well as the kind of relation they have.

## Interpreting scatter graphs

Scatter graphs tell you about correlation between variables. They do **not** prove that one variable has an effect on another: a correlation could be caused by a third factor or could just be a coincidence.

### Definitions

| Term | Definition |
| --- | --- |
| Correlation | How closely two variables are related |
| Outliers | Points on the graph that do not fit the trend |
| Line of best fit | A straight line drawn through the middle of points, as close to each one as possible |
| Strong correlation | Points make a fairly straight line |
| Weak correlation | Points stray more from a straight line, but there is still positive or negative correlation |
| Negative correlation | Sloping downhill from left to right |
| Positive correlation | Sloping uphill from left to right |
| No correlation | Points have no pattern |
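The definitions above describe correlation by eye. One common way to put a number on it is Pearson's correlation coefficient $r$, which these notes do not cover, so treat the following as an optional sketch. The cut-off values `strong` and `weak` are illustrative assumptions rather than standard definitions, and the data is made up. Note that $r$ only measures how close the points are to a straight line.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe(r, strong=0.7, weak=0.3):
    """Turn r into one of the labels from the definitions table.

    The thresholds are assumptions chosen for illustration only.
    """
    if abs(r) < weak:
        return "no correlation"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


# Hypothetical data: hours of revision and test score
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 70]
print(describe(pearson_r(hours, score)))
```

A value of $r$ close to $1$ or $-1$ corresponds to points that make a fairly straight line, and the sign of $r$ matches the direction of the slope.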

### Making predictions using scatter graphs

By drawing a line of best fit, you can make predictions using scatter graphs. When drawing the line, ignore any outliers, as these could have been caused by an error during the experiment or when recording the data.

**Tip:** *When drawing a line of best fit, aim for roughly the same number of points above the line as below it.*
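A line of best fit is usually drawn by eye, but it can also be calculated. The sketch below uses the least-squares method, which is a different technique from the hand-drawn approach described here and is shown only as an illustration. The function names and the data are hypothetical and are not taken from the example that follows.

```python
def line_of_best_fit(xs, ys):
    """Return (gradient, intercept) of the least-squares straight line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    # The least-squares line always passes through the mean point
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def predict(x, gradient, intercept):
    """Read a predicted y value off the line for a given x."""
    return gradient * x + intercept


# Hypothetical data with any outliers already removed
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

gradient, intercept = line_of_best_fit(xs, ys)
print(gradient, intercept)
print(predict(3.5, gradient, intercept))
```

Calling `predict` is the calculated equivalent of reading up to the line and along to the other axis.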

##### Example 1

*This scatter graph shows the correlation between the number of books read per year and age.*

*Draw a line of best fit and predict how many books per year a thirty-five-year-old would read.*

*Reading up and along, you can predict that a thirty-five-year-old would read* $\underline{16}$ *books a year.*

**Note**: In the example above, the line of best fit has been 'extrapolated', meaning it has been extended beyond the recorded data points to make a prediction.

## Accuracy of predictions

| Term | Definition |
| --- | --- |
| **Extrapolation** | A type of estimation that goes outside the original range of observed or recorded values, based on the trend in the existing results. |

Whilst extrapolation can provide a useful prediction, it is not always reliable, as you have no way of knowing whether the trend in the data continues beyond the recorded values.
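As a rough illustration, a prediction counts as an extrapolation when the value you are predicting for lies outside the range of the recorded data. The function name and the data in this minimal sketch are hypothetical.

```python
def is_extrapolation(x, observed_xs):
    """Return True if x lies outside the range of the observed x values."""
    return x < min(observed_xs) or x > max(observed_xs)


# Hypothetical recorded x values
observed = [12, 15, 18, 21, 24]

print(is_extrapolation(20, observed))  # inside the recorded range
print(is_extrapolation(35, observed))  # outside the recorded range
```

Predictions for which this check returns `True` should be treated with more caution, since the trend may not hold there.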