Correlation
In a nutshell
Bivariate data can be shown on a scatter graph. This helps you see whether or not the two variables are correlated.
Bivariate data
Bivariate data is the result of observing two variables for each member of the same sample, so the observations come in pairs.
Example 1
If you record the height and the weight of each person in a certain population, you are creating bivariate data: each person's height is paired with their weight.
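As a rough sketch of what such paired data might look like in a program, the snippet below stores one (height, weight) pair per person. The numbers and the names `people`, `heights` and `weights` are invented for illustration and do not come from any real sample.

```python
# Hypothetical sample: each tuple is one person's (height in cm, weight in kg).
people = [(152, 48), (160, 55), (168, 62), (175, 70), (183, 79)]

# Split the pairs into two lists, keeping the order so that
# heights[i] and weights[i] always belong to the same person.
heights = [h for h, _ in people]
weights = [w for _, w in people]

for h, w in zip(heights, weights):
    print(f"height {h} cm -> weight {w} kg")
```

Keeping the two lists in the same order is what makes the data bivariate: shuffling one list on its own would break the link between each person's two measurements.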
Scatter graphs
Bivariate data can be shown on a scatter graph.
The horizontal axis shows the independent variable (the one you control) and the vertical axis shows the dependent variable (the one being measured).
The two variables in a scatter graph are often correlated:
| Correlation | Meaning |
| Strong positive correlation | On average, when one variable increases, the other also increases. |
| Strong negative correlation | On average, when one variable increases, the other decreases. |
| No correlation | There is no clear relation between the two variables. |
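The direction and strength of a correlation can also be estimated numerically. One common measure, not named in the text above, is Pearson's correlation coefficient r, which lies between -1 and 1. The sketch below computes it from scratch. The function names, the sample data and the cut-offs 0.7 and 0.3 used to turn r into words are all assumptions made for illustration, since there is no universally agreed threshold.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired lists xs and ys."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)


def describe(r, strong=0.7, weak=0.3):
    """Label r in words. The cut-offs 0.7 and 0.3 are assumed, not standard."""
    if abs(r) < weak:
        return "no clear correlation"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]
falling = [9, 7, 6, 4, 1]

print(describe(pearson_r(xs, rising)))
print(describe(pearson_r(xs, falling)))
```

A value of r near 1 or -1 corresponds to points lying close to a straight line, while a value near 0 corresponds to a scatter with no clear linear trend.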
Causality
Two variables have a causal relationship if a change in one variable induces a change in the other.
However, just because two variables are correlated, it doesn't necessarily mean they have a causal relationship. For example, ice cream sales and cases of sunburn may rise together because both depend on hot weather, not because one causes the other. Causation can only be deduced from the context of the data.
Example 2
Describe the correlation between the two variables shown on the graph. Is there a causal relationship between them?
You can see that as the x values increase, the y values also increase. Drawing a line of best fit through the points makes this trend clear.
Therefore, the two variables have a strong positive correlation.
However, you cannot state that there is a causal relationship between the variables. You don't know what the axes represent, so the context is not known.
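The line of best fit used in Example 2 is usually drawn by eye, but it can also be calculated. The sketch below uses the least-squares method, which is one common choice and is an assumption here rather than something the example specifies. The function name `best_fit_line` and the data points are hypothetical.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The least-squares line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical points that rise from left to right.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = best_fit_line(xs, ys)
direction = "positive" if slope > 0 else "negative" if slope < 0 else "none"
print(f"y = {slope:.2f}x + {intercept:.2f} ({direction} direction)")
```

The sign of the slope only gives the direction of the correlation. How strong the correlation is depends on how closely the points cluster around the line, and neither the slope nor the fit says anything about causation.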