# Covariance -- A Visual Walk Through

In a previous post, I walked through the calculation of variance and standard deviation, visualizing each step. This post is dedicated to visualizing another statistic: covariance.

Covariance is a measure of the joint variability of two random variables.
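To get a feel for what "joint variability" means, here is a minimal sketch in R. The vectors `x`, `y_up` and `y_down` are made-up toy data (they are not the data used later in this post): when two variables tend to rise together the covariance comes out positive, and when one falls as the other rises it comes out negative.

```r
# Hypothetical toy data, chosen only to illustrate the sign of covariance
x      <- c(1, 2, 3, 4, 5)
y_up   <- 2 * x    # rises whenever x rises
y_down <- -2 * x   # falls whenever x rises

cov_up   <- cov(x, y_up)    # positive: x and y_up move together
cov_down <- cov(x, y_down)  # negative: x and y_down move in opposite directions

print(cov_up)
print(cov_down)
```

The two results have the same magnitude and opposite signs, because `y_down` is just `y_up` flipped around zero.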

Let’s have a look at the sample covariance equation as a whole:

$$cov(x,y) = \frac{\sum_{i=1}^n (x_i-\overline{x})(y_i-\overline{y})}{n-1}$$

And now let’s apply the equation to a concrete case: a set of paired observations of x and y.

Ready? Okay, now let’s walk through the calculation; there are 7 small steps:

# Step 1: find the mean of x

## $$\overline{x}$$

# Step 2: find the mean of y

## $$\overline{y}$$

# Step 3: calculate difference between x and mean of x

## $$x_i-\overline{x}$$

# Step 4: calculate difference between y and mean of y

## $$y_i-\overline{y}$$

# Step 5: multiply these differences (observation-wise)

## $$(x_i-\overline{x})(y_i-\overline{y})$$

# Step 6: Add these areas

## $$\sum_{i=1}^n (x_i-\overline{x})(y_i-\overline{y})$$

# Step 7: Divide through by the number of observations minus 1 (the result will be a bit larger in magnitude than the average)

## $$cov(x,y) = \frac{\sum_{i=1}^n (x_i-\overline{x})(y_i-\overline{y})}{n-1}$$

That’s it.
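The seven steps above can be sketched in R roughly as follows. The vectors `x` and `y` here are hypothetical toy data, not the data behind the plots in this post, and the variable names are my own choice for illustration.

```r
# Hypothetical toy data; the post's own x and y are not reproduced here
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)

x_bar <- mean(x)              # Step 1: mean of x
y_bar <- mean(y)              # Step 2: mean of y
dx    <- x - x_bar            # Step 3: differences between x and its mean
dy    <- y - y_bar            # Step 4: differences between y and its mean
rectangle  <- dx * dy         # Step 5: observation-wise products (signed areas)
total_area <- sum(rectangle)  # Step 6: add the areas
cov_manual <- total_area / (length(x) - 1)  # Step 7: divide by n - 1

print(cov_manual)  # 4 for this toy data
print(cov(x, y))   # R's built-in sample covariance gives the same value
```

Each product in `rectangle` is positive when both differences have the same sign and negative when they differ, which is why the areas can partly cancel out when they are summed.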

Now we can compare this visualized result to what we would get if we simply trust the R covariance function to calculate this for us.

```r
# df$rectangle holds the observation-wise products from Step 5
sum(df$rectangle)/(nrow(df)-1)
## [1] 0.4766744
cov(x,y) # Calculation for **sample** covariance
## [1] 0.4766744
```

Great. It’s a match!
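The word "sample" matters here: R's `cov()` divides by n - 1, not by n. As a rough sketch of the difference, again on made-up toy data (not the data from this post), we can compare the plain average of the products with the sample covariance:

```r
# Hypothetical toy data, used only to contrast dividing by n and by n - 1
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)
n <- length(x)

products <- (x - mean(x)) * (y - mean(y))

avg_product <- sum(products) / n        # plain average of the products
sample_cov  <- sum(products) / (n - 1)  # what cov(x, y) computes

print(avg_product)
print(sample_cov)
print(cov(x, y) * (n - 1) / n)  # rescaling cov() recovers the plain average
```

Since n - 1 is smaller than n, the sample covariance is always a bit larger in magnitude than the plain average, and the gap shrinks as the number of observations grows.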

# Discussion question

What would the units of the unadjusted covariance be for life expectancy in years and per capita GDP in dollars?

Note: The normalized version of covariance is Pearson’s correlation coefficient.
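As a small sketch of what "normalized" means here: dividing the covariance by the product of the two sample standard deviations gives Pearson's correlation coefficient. The data below are hypothetical toy values, and the factor of 1000 is an arbitrary change of scale that I picked for illustration.

```r
# Hypothetical toy data, not the data used in this post
x <- c(2, 4, 6, 8, 10)
y <- c(1, 3, 2, 5, 4)

# Normalize the covariance by both sample standard deviations
r_manual  <- cov(x, y) / (sd(x) * sd(y))
r_builtin <- cor(x, y)  # Pearson is the default method of cor()

print(r_manual)
print(r_builtin)

# Express y on a different scale (arbitrary factor, for illustration only)
y_scaled   <- 1000 * y
cov_scaled <- cov(x, y_scaled)  # covariance changes with the scale of y
r_scaled   <- cor(x, y_scaled)  # correlation does not

print(cov_scaled)
print(r_scaled)
```

The covariance depends on the scale the variables are measured in, while the correlation stays the same, which is what makes correlation easier to compare across data sets.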