Misleading Graphs and Stats

Numbers don't lie, but they can be used to stretch the truth!

Here we have collected some common ways data can be misleading.

Cut the Y-Axis

This is the most common "trick" in graphing.

Bar graph with vertical axis starting at 0, showing accurate proportions between the bars
vs
Bar graph with vertical axis starting at 700, making small differences look very large

The vertical axis (the Y-axis) should usually start at zero. When it starts at a higher number, it makes small differences look huge.
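
As a rough sketch of the effect, the snippet below compares how tall two bars look relative to each other when the axis starts at 0 and when it starts at 700. The values 750 and 800 and the function name apparent_ratio are made up for illustration.

```python
def apparent_ratio(a, b, baseline):
    """Ratio of the drawn bar heights when the axis starts at `baseline`."""
    return (b - baseline) / (a - baseline)

# Hypothetical values, chosen only to illustrate the effect.
small, large = 750, 800

true_ratio = apparent_ratio(small, large, baseline=0)    # axis starts at 0
cut_ratio = apparent_ratio(small, large, baseline=700)   # axis starts at 700

print(f"Axis from 0:   second bar looks {true_ratio:.2f}x as tall")
print(f"Axis from 700: second bar looks {cut_ratio:.2f}x as tall")
```

With these numbers the real difference is under 7%, yet on the cut axis the second bar is drawn twice as tall as the first.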

Pictograph Pitfalls

Pictographs use images to show data.

When a value doubles, there should be double the number of icons, not icons that are double the size.

A Ball: soccer ball
Twice as Many: soccer ball, soccer ball
4 Times the Area or 8 Times the Volume: soccer ball (enlarged)

An image that's twice as tall and twice as wide has four times the area. Our eyes judge the area, so we think the change is much bigger than it really is.
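
The arithmetic can be sketched in a few lines: scaling every dimension by the same factor multiplies area by the factor squared and volume by the factor cubed. The function name scale_effect is just an illustrative choice.

```python
def scale_effect(factor):
    """How much height, area and volume grow when every dimension is scaled by `factor`."""
    return {"height": factor, "area": factor ** 2, "volume": factor ** 3}

doubled = scale_effect(2)
print(doubled)  # {'height': 2, 'area': 4, 'volume': 8}
```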

Small or Chosen Samples

Illustration of selecting a diverse group of random individuals from a larger crowd

A small sample can give very inaccurate results.

Try asking just a few people for their favorite ice cream!

We usually need hundreds to thousands of people before we can take a sample seriously.

And the sample needs to be taken randomly! What if you only ask people at a pool if they like swimming?
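
One rough way to see why small samples are shaky is the textbook 95% margin of error for a proportion from a simple random sample, approximately 1.96 × sqrt(p(1 − p) / n). The sketch below assumes that approximation and the worst case p = 0.5. The function name is illustrative, and the formula says nothing about samples that were not chosen randomly.

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from a simple random sample of size n."""
    return z * sqrt(p * (1 - p) / n)

# Compare a handful of people with hundreds and thousands.
for n in (10, 100, 1000):
    print(f"n = {n:>4}: about +/- {margin_of_error(n) * 100:.1f} percentage points")
```

Under these assumptions, ten people give an answer that could be off by roughly 30 percentage points, while a thousand people bring that down to about 3.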

Learn more at Sampling.

What's Average?

When someone says "The average is...", they may be choosing the version of "average" that supports their argument best.

Number line with a cluster of points near 10 and a single outlier far to the right at 90

A mean can be pulled way up or down by a single outlier (a very high or low number).

In fact, the median (the middle value) is often a better guide to what is "typical" in the data.

If the mean is much higher or lower than the median, an extreme value (very large or very small) is likely affecting the data!

The Billionaire Effect: If nine people who each earn $30,000 a year sit in a room with a billionaire who earns $1 billion a year, the mean (average) income of the room is over $100 million! But the median (middle) income is still $30,000. The median gives a much truer picture of a typical person in that room.
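
The room can be checked with Python's statistics module. This sketch assumes the billionaire's income is $1 billion a year, the figure that gives a mean just over $100 million.

```python
from statistics import mean, median

# Nine people on $30,000 a year plus one person assumed to earn $1 billion a year.
incomes = [30_000] * 9 + [1_000_000_000]

room_mean = mean(incomes)
room_median = median(incomes)

print(f"Mean income:   ${room_mean:,.0f}")    # $100,027,000
print(f"Median income: ${room_median:,.0f}")  # $30,000
```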

Cherry Picking

This is when someone shows you only a small piece of the data "map".

If a company shows a graph of its profits for only the last two months, when they went up, but hides the previous ten months, when they crashed, that's cherry picking. The company is picking the "best" parts to show you.
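
Here is a small sketch of how different the same data can look depending on the window shown. The monthly profit figures are invented purely for illustration, and percent_change is a hypothetical helper.

```python
def percent_change(series):
    """Percentage change from the first value in the series to the last."""
    return (series[-1] - series[0]) * 100 / series[0]

# Invented monthly profits: ten months of decline, then two months of recovery.
profits = [100, 92, 85, 77, 70, 62, 55, 48, 40, 33, 36, 40]

last_two_months = percent_change(profits[-2:])  # only the cherry-picked window
full_year = percent_change(profits)             # the whole picture

print(f"Last two months: {last_two_months:+.1f}%")
print(f"Full year:       {full_year:+.1f}%")
```

With these made-up numbers, the two-month graph shows growth of about 11%, while the full year shows a 60% fall.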

"Correlation Isn't Causation"

Just because two things happen at the same time doesn't mean one caused the other.

Example: Sunglasses vs Ice Cream

An ice cream shop finds out how many sunglasses a big store sold each day and compares those numbers with its own daily ice cream sales:

Scatter plot showing a strong positive correlation between sunglasses sold and ice cream sales

The correlation between sunglasses sales and ice cream sales is high.

Does this mean that sunglasses make people want ice cream?

Or maybe when it is hot and sunny, more people buy sunglasses and more people buy ice cream. The two trends are linked by a third, hidden cause.
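
A hidden cause can be sketched with invented data. Below, both sales series are built from the same made-up daily temperatures plus some fixed "noise", and neither series is ever calculated from the other. All numbers and the pearson helper are assumptions for illustration only.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented daily temperatures: the hidden third factor.
temperature = [14, 16, 18, 20, 22, 24, 26, 28, 30, 32]

# Each series depends only on temperature plus its own fixed noise.
sunglasses_noise = [3, -2, 1, -3, 2, -1, 3, -2, 0, 1]
ice_cream_noise = [-20, 10, 25, -15, 5, -10, 20, -25, 15, 0]
sunglasses = [2 * t + e for t, e in zip(temperature, sunglasses_noise)]
ice_cream = [15 * t + e for t, e in zip(temperature, ice_cream_noise)]

r = pearson(sunglasses, ice_cream)
print(f"Correlation between sunglasses and ice cream sales: {r:.2f}")
```

The correlation comes out high even though sunglasses never appear in the ice cream formula; temperature drives both.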

See correlation for more.