Covariance

Covariance is a single number we can calculate from a list of paired values.

It tells us if the paired values tend to rise together, or if one tends to rise as the other falls.

The Calculations

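Imagine we have n pairs of values (x, y). We do these calculations:

  • Find the mean of the x values
  • Find the mean of the y values

Then for each pair of values:

  • Subtract the mean of x from x and the mean of y from y, then multiply the two differences

And lastly:

  • Add those results up and divide by n−1

And we get the covariance.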
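
The calculation can be sketched in a few lines of Python. This is a minimal illustration rather than a library routine; the function name `sample_covariance` and its argument names are assumptions made for this example.

```python
def sample_covariance(xs, ys):
    """Sample covariance of paired values (hypothetical helper for illustration)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two pairs")

    # Find the mean of the x values and the mean of the y values
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # For each pair: subtract the means, then multiply the two differences
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]

    # Add the results up and divide by n - 1
    return sum(products) / (n - 1)


print(sample_covariance([1, 2, 3], [2, 4, 6]))  # 2.0
```

The small made-up input at the end has means 2 and 4, products 2, 0 and 2, and so a covariance of 4 / 2 = 2.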

Example: Ice Cream Sales

The local ice cream shop keeps track of how much ice cream they sell versus the temperature on that day. Here are their figures for the last few days:

Ice Cream Sales vs Temperature

Temperature (°C)    Ice Cream Sales ($)
14.2                215
16.4                325
15.2                332
22.6                445
17.2                408

Find the mean of the x values (temperature) by adding them up and dividing by how many:

mean of x = (14.2 + 16.4 + 15.2 + 22.6 + 17.2) / 5 = 17.12

Find the mean of the y values (sales in dollars):

mean of y = (215 + 325 + 332 + 445 + 408) / 5 = 345

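Then for each pair of values, subtract the mean of x from x and the mean of y from y, and multiply the two differences:

(14.2 − 17.12) × (215 − 345) = (−2.92) × (−130) = 379.6
(16.4 − 17.12) × (325 − 345) = (−0.72) × (−20) = 14.4
(15.2 − 17.12) × (332 − 345) = (−1.92) × (−13) = 24.96
(22.6 − 17.12) × (445 − 345) = 5.48 × 100 = 548
(17.2 − 17.12) × (408 − 345) = 0.08 × 63 = 5.04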

Add those results up and divide by n−1:

(379.6 + 14.4 + 24.96 + 548 + 5.04) / (5 − 1) = 972 / 4 = 243
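
As a quick check of the arithmetic, the same steps can be run in Python on the shop's figures. The variable names here are chosen for this sketch only, and the results are rounded because floating-point arithmetic can leave tiny differences in the last decimal places.

```python
temperatures = [14.2, 16.4, 15.2, 22.6, 17.2]   # x values, in °C
sales = [215, 325, 332, 445, 408]               # y values, in dollars

n = len(temperatures)
mean_x = sum(temperatures) / n
mean_y = sum(sales) / n

# One product per pair: (x - mean of x) * (y - mean of y)
products = [(x - mean_x) * (y - mean_y) for x, y in zip(temperatures, sales)]

# Add the products up and divide by n - 1
covariance = sum(products) / (n - 1)

print(round(mean_x, 2), round(mean_y, 2))   # 17.12 345.0
print([round(p, 2) for p in products])      # [379.6, 14.4, 24.96, 548.0, 5.04]
print(round(covariance, 2))                 # 243.0
```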

The answer is positive: that tells us the x and y values tend to rise together.

That is all it says. Not how strongly linked they are. Not how fast they rise or fall. Just that they tend to rise and fall together.

A negative result would say that x rises as y falls (and vice versa).

A zero result (which rarely happens with real statistical data) just means the covariance gives no indication of whether x and y rise or fall together.

Covariance:

  • positive says they rise and fall together
  • negative says that one rises as the other falls
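
The positive, negative and zero cases can each be illustrated with small made-up data sets. This is a sketch only: the helper `sample_covariance` and all of the numbers are invented for this example. In the zero case y is the square of x, so the values are clearly related, yet the covariance alone does not show it.

```python
def sample_covariance(xs, ys):
    # Hypothetical helper: sum of (x - mean x)(y - mean y), divided by n - 1
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


xs = [1, 2, 3, 4, 5]
rising = sample_covariance(xs, [10, 20, 30, 40, 50])    # y rises as x rises
falling = sample_covariance(xs, [50, 40, 30, 20, 10])   # y falls as x rises

# y = x squared: y falls and then rises, so neither tendency wins
zero = sample_covariance([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(rising)    # 25.0
print(falling)   # -25.0
print(zero)      # 0.0
```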

The size of the covariance is not important, because it depends on the units we measure in. Imagine we used cents instead of dollars: the covariance would be 100 times larger, but the data would still have the same relationship.
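
The effect of changing units can be checked directly on the ice cream figures. This is a minimal sketch; the helper `sample_covariance` and the variable names are assumptions made for this example.

```python
def sample_covariance(xs, ys):
    # Hypothetical helper: sum of (x - mean x)(y - mean y), divided by n - 1
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


temperatures = [14.2, 16.4, 15.2, 22.6, 17.2]
sales_dollars = [215, 325, 332, 445, 408]
sales_cents = [100 * s for s in sales_dollars]   # same sales, smaller unit

cov_dollars = sample_covariance(temperatures, sales_dollars)
cov_cents = sample_covariance(temperatures, sales_cents)

print(round(cov_dollars, 2))   # 243.0
print(round(cov_cents, 2))     # 24300.0
```

Both results are positive, so the conclusion is the same; only the size has changed.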

Note: the covariance can also be useful in other calculations.

Formula

As a formula covariance is:

$$
\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
$$

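Where:

  • n is the number of pairs of values
  • xi and yi are the values in the i-th pair
  • x̄ is the mean of the x values
  • ȳ is the mean of the y values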

Upgrade to Correlation

Correlation has a few more steps in its calculation, but it also gives the useful result of telling us how strongly related x and y are.
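
As a rough sketch of what those extra steps can look like, assume that "correlation" here means Pearson's correlation coefficient, which divides the covariance by the product of the two sample standard deviations. The helper names below are invented for this example; `stdev` comes from Python's standard `statistics` module.

```python
from statistics import stdev


def sample_covariance(xs, ys):
    # Hypothetical helper: sum of (x - mean x)(y - mean y), divided by n - 1
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    # Assumed definition: covariance scaled by both sample standard deviations
    return sample_covariance(xs, ys) / (stdev(xs) * stdev(ys))


temperatures = [14.2, 16.4, 15.2, 22.6, 17.2]
sales_dollars = [215, 325, 332, 445, 408]
sales_cents = [100 * s for s in sales_dollars]

r_dollars = correlation(temperatures, sales_dollars)
r_cents = correlation(temperatures, sales_cents)

print(round(r_dollars, 3))   # 0.838
print(round(r_cents, 3))     # 0.838
```

Unlike the covariance, this value does not change when the sales are expressed in cents instead of dollars.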

Why n−1?

We divide by n−1 in the final step when our data is a sample (which it usually is), because the means we subtract are calculated from the sample itself and are only estimates of the true population means.

But if our data is the entire population then we should divide by n in the final step.
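
The two choices of divisor can be compared side by side. This is a sketch under stated assumptions: the function `covariance` and its `population` flag are hypothetical names for this example, not part of any library.

```python
def covariance(xs, ys, population=False):
    """Covariance of paired values; divides by n for a population, n - 1 for a sample."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    divisor = n if population else n - 1
    return total / divisor


temperatures = [14.2, 16.4, 15.2, 22.6, 17.2]
sales = [215, 325, 332, 445, 408]

print(round(covariance(temperatures, sales), 2))                    # 243.0
print(round(covariance(temperatures, sales, population=True), 2))   # 194.4
```

With only five pairs the difference is noticeable (972 / 4 versus 972 / 5); it shrinks as n grows.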