Designing for Clarity in Data Science

Hello! Have you ever looked at a chart without understanding anything of what it was trying to say? In this post I will show you how to avoid simple mistakes in your charts and how to make them more readable and comprehensible.

We will talk about Tufte’s design principles, which emphasize making visualizations that are clear and easy to read.

A key aim of data visualization is to make the data as easy to interpret as possible. Your visualizations also need to be credible: the last thing you want as a designer is to have your entire project discredited because the data was displayed misleadingly.

This approach comes from Edward Tufte, who is an American statistician and a pioneer in the field of data visualization.

Tufte emphasized four main principles that we should follow to craft clear and readable data visualizations, all of which focus on staying honest to the data and eliminating distracting elements.

In terms of graphical integrity, data can easily be skewed by its presentation.

Just below, we have a very famous example from Fox News. The chart uses a distorted Y-axis to exaggerate the difference between 35% and 40%. The graph on the right conveys the data more truthfully, and the effect isn’t quite as shocking.

source: https://www.kdnuggets.com/2012/12/taking-misleading-statistics-to-a-new-level.html

For this reason, it’s generally considered bad practice to start a Y-axis anywhere other than zero. A truncated axis delivers instant shock value, but it throws the whole argument into question as soon as it’s viewed in a critical light.
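To get a feel for how much a truncated axis can exaggerate a difference, here is a minimal sketch that compares how tall two bars are drawn under two different baselines. The values 35 and 40 come from the example above; the baseline of 34 and the helper names `drawn_height` and `height_ratio` are assumptions made up for this illustration, not values read off the original chart.

```python
def drawn_height(value: float, baseline: float) -> float:
    """Height of the bar for `value` when the Y-axis starts at `baseline`."""
    if value <= baseline:
        raise ValueError("value must lie above the axis baseline")
    return value - baseline


def height_ratio(small: float, large: float, baseline: float) -> float:
    """How many times taller the larger bar is drawn than the smaller one."""
    return drawn_height(large, baseline) / drawn_height(small, baseline)


# Assumption: a truncated axis starting at 34 (hypothetical baseline).
truncated = height_ratio(35, 40, baseline=34)  # (40 - 34) / (35 - 34)

# Same data with the axis starting at zero.
honest = height_ratio(35, 40, baseline=0)      # 40 / 35

print(f"axis starting at 34: taller bar is drawn {truncated:.1f}x as tall")
print(f"axis starting at 0:  taller bar is drawn {honest:.2f}x as tall")
```

Under these assumptions, the same five-point difference is drawn as a bar six times taller on the truncated axis, but only about 1.14 times taller when the axis starts at zero.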

Tufte defines the lie factor as the size of an effect shown in the graphic divided by the size of the effect as it’s measured in the actual data.

For a chart with high graphical integrity, this figure should be close to one, and there shouldn’t be any inherent bias that skews the perception.
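As a rough illustration, the lie factor can be computed in a few lines. This sketch measures each effect as a relative change between two values, which is one common reading of the definition. The function names and the drawn bar heights (1 and 6 units, as if the axis had been cut off just below the smaller value) are hypothetical and only serve the example.

```python
def relative_change(first: float, second: float) -> float:
    """Size of an effect, expressed as the relative change from first to second."""
    if first == 0:
        raise ValueError("relative change is undefined when the first value is 0")
    return (second - first) / first


def lie_factor(data_first: float, data_second: float,
               drawn_first: float, drawn_second: float) -> float:
    """Effect shown in the graphic divided by the effect present in the data."""
    data_effect = relative_change(data_first, data_second)
    if data_effect == 0:
        raise ValueError("lie factor is undefined when the data show no change")
    graphic_effect = relative_change(drawn_first, drawn_second)
    return graphic_effect / data_effect


# Data values 35 and 40, drawn as bars of 1 and 6 units (hypothetical heights).
distorted = lie_factor(35, 40, drawn_first=1, drawn_second=6)

# The same data drawn with bar heights proportional to the values.
faithful = lie_factor(35, 40, drawn_first=35, drawn_second=40)

print(f"distorted chart: lie factor of about {distorted:.1f}")
print(f"faithful chart:  lie factor of {faithful:.1f}")
```

With these made-up heights the distorted chart has a lie factor of roughly 35, while the proportional chart comes out at exactly one, which is the target for a chart with high graphical integrity.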

Another famous example, shown here, comes from a keynote speech given by Steve Jobs in 2008.

source: https://paragraft.wordpress.com/2008/06/03/the-chart-junk-of-steve-jobs/

The tilted 3D pie chart makes the section at the bottom, representing Apple’s market share, seem much bigger than it really is.

Tufte coined the term “chart junk” to mean any elements on the screen that distract the viewer from the data whilst adding no extra information.

In general, anything 3D, background graphics, or any other unnecessary framing elements would be considered chart junk according to Tufte.

Now you should feel able to weigh up different visualizations against some common design rules. This is an important skill to develop as a data visualization engineer. You should be able to justify the design decisions you make to the people you’re working with.

Moreover, you should have the vocabulary to constructively critique visualizations that can be improved.

However, it’s important to know that these views are just guidelines, and design is always a highly subjective matter.

I hope you enjoyed this article. Don’t hesitate to send me your questions by DM. See you soon!