
Why I hate Pie Charts

The pie chart should be avoided at all costs. Why?

  • you can only display a limited number of slices (although that doesn’t really stop people from creating pie charts with dozens of slices)
  • they take up a lot of space
  • the human brain cannot easily compare slices based on radial areas. We’re better at comparing areas like squares and rectangles, but length is preferred, such as in a bar chart.
  • to make it more effective, you have to add labels and/or a legend, which only makes it worse
  • sometimes people forget pie charts represent a “part of a whole” relationship and if you add the percentages together, you end up over 100%
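
The last point is easy to check before plotting anything. Below is a minimal Python sketch, assuming the shares are given as plain percentages; the function name `is_part_of_whole` and the 0.5 tolerance for rounding are my own choices for illustration, not part of any charting tool.

```python
import math


def is_part_of_whole(percentages, tolerance=0.5):
    """Return True when the shares add up to 100% within a rounding tolerance.

    Both the name and the default tolerance are illustrative assumptions.
    """
    return math.isclose(sum(percentages), 100.0, abs_tol=tolerance)
```

If this returns False (for example when survey respondents could pick multiple answers), the data is not a "part of a whole" and a pie chart is the wrong choice anyway.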

For these reasons, it’s almost always better to replace the pie chart with a bar or column chart. The only exception is when you have a very limited number of slices (maximum 3 but I prefer 2). In this case, the “part of a whole” relationship can be quite accurately displayed. It’s useful for “omg look how big that slice is compared to the other tiny slice”. An example:

If you do have to create a pie chart, adhere to these simple rules:

  • start the first slice at 12 o’clock. Thanks to human evolution, we can at least somewhat decently read an analog clock. This means we can quite accurately read the size of the first slice.
  • sort the slices. Preferably by size, starting with the biggest one, because that’s probably where you want the focus to be.
  • keep the number of slices down. Combine the smallest slices into one bigger slice and label it “Other…” or something like that.
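
The sorting and combining rules can be sketched in a few lines of Python. This is only an illustration: the function name `prepare_slices`, the `max_slices` parameter and the "Other" label are assumptions of mine, and the input is assumed to be a dictionary mapping a label to its value.

```python
def prepare_slices(values, max_slices=3, other_label="Other"):
    """Sort slices by size (largest first) and fold the smallest into one slice.

    `values` maps a label to its numeric value. All names here are
    hypothetical helpers for illustration.
    """
    if max_slices < 1:
        raise ValueError("max_slices must be at least 1")
    # Biggest slice first, so it ends up starting at 12 o'clock.
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    if len(ordered) <= max_slices:
        return ordered
    # Keep the largest slices and combine the rest into a single slice.
    head = ordered[:max_slices - 1]
    tail_total = sum(value for _, value in ordered[max_slices - 1:])
    return head + [(other_label, tail_total)]
```

If you draw the result with matplotlib, passing `startangle=90` and `counterclock=False` to `pie` should make the first slice start at 12 o’clock and run clockwise, like a clock.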

Now, why this blog post, when most of this has already been described before? I was recently reading the blog post What’s More Popular: SQL Server 2014, or SQL Server 2005? by Brent Ozar (blog | twitter), and it linked to yet another fine specimen of pie chart junk (note: the chart was not created by Brent, just to be clear). It’s a very interesting post, with an interesting discussion in the comments as well, about the adoption rate of the different SQL Server versions among a large sample of Dell servers.

The problems are quite clear, since the chart doesn’t follow any of the rules I explained earlier. There are too many slices and thus too many colors as well. Could you see that 2014 is double the size of 2000 without using the labels?

I imported the data into Power BI and quickly created this column chart:

Much clearer, isn’t it? And only one color is needed. In the discussion on Brent’s blog, someone suggested sorting the data not by size but by release date, which actually makes this graph much better, since the data has a time aspect. It’s now obvious that 2005 is still more popular than 2008 and 2014. You can also clearly see that 2014 is bigger than 2000.

The chart resembles a normal distribution skewed to the right, which might be expected for adoption rates of a technology product, but you can see 2005 and 2014 are somewhat outliers. I tend to believe SQL Server 2014 didn’t really have anything substantial to offer, certainly not for BI (aside from the clustered columnstore index, which in itself is not a reason to upgrade), and if your shop didn’t need in-memory OLTP there was no actual reason to upgrade.

SQL Server 2016 on the other hand, will be awesome 🙂


Koen Verbeeck

Koen Verbeeck is a Microsoft Business Intelligence consultant at AE, helping clients to get insight in their data. Koen has a comprehensive knowledge of the SQL Server BI stack, with a particular love for Integration Services. He's also a speaker at various conferences.
