Removing the Clutter from Graphs

You can quickly get an audience to see patterns and trends in data if you present that data graphically. Data visualization is often the most persuasive of mediums, and yet it’s so easy to get it very wrong. With any of the rich variety of available data tools, it is the work of a moment to create an impressive visualization of your data set, be it a simple pie chart, data bar, gauge, 3D graph or map. Not quite what you’re after? Never mind, you can experiment with custom visualizations, and add the fancy labels, fonts, colors and 3D animations that will really impress your audience! There are several dangers in doing this: you can easily dilute the impact of the data, bemuse your audience, mislead others or, worse, hoodwink yourself. Most graphical packages provide an enormous range of 3D effects, icons and other graphical conceits. Just because you can use all of these things doesn’t mean you should.

In science, the practice of creating graphs and charts has long been governed by rules intended to prevent the reader from being misled. When marketing pressures or other agendas are present, all bets are off. In scientific or engineering practice, good data visualization is really a process of subtraction, of removing from the viewer’s eye any clutter that isn’t entirely relevant to the point you need the data to convey. Edward Tufte’s seminal work on data visualization, The Visual Display of Quantitative Information, was first published in 1983 and continually emphasizes the need to find the simplest, most effective way to represent the “truth” of the data. As the author puts it, “every single pixel should testify directly to the content.”

By removing clutter, you make room to show more data without causing confusion. For example, the subtle use of different shades of the same color in a map can reveal patterns in huge datasets. Likewise, a simple technique such as “small multiples”, where the same graph is repeated several times with just one dimension changed, can be a highly effective way to reveal trends and to suggest cause and effect in data. Sparklines, Tufte’s own invention, are a perfect example of this.
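To make the idea of small multiples and sparklines concrete, here is a minimal sketch in Python that draws each series as a one-line text sparkline on a shared scale. The function names (sparkline, small_multiples) and the sales figures are invented for illustration, and Unicode block characters are only a rough stand-in for a properly drawn sparkline.

```python
# Eight block characters, ordered from the shortest bar to the tallest.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values, lo=None, hi=None):
    """Render a sequence of numbers as a one-line text sparkline.

    lo and hi set the scale; by default they are the series' own min and max.
    """
    values = list(values)
    if not values:
        return ""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    span = hi - lo
    if span == 0:
        # A flat series has no variation to show: draw the lowest bar throughout.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    chars = []
    for v in values:
        # Scale the value to 0..top, round to the nearest level, then clamp.
        level = int((v - lo) / span * top + 0.5)
        chars.append(BLOCKS[max(0, min(top, level))])
    return "".join(chars)


def small_multiples(series_by_name):
    """Return one labelled sparkline per series, all drawn on the same scale.

    Sharing the scale is what makes the rows directly comparable.
    """
    all_values = [v for vals in series_by_name.values() for v in vals]
    lo, hi = min(all_values), max(all_values)
    width = max(len(name) for name in series_by_name)
    return [
        f"{name.ljust(width)} {sparkline(vals, lo, hi)}"
        for name, vals in series_by_name.items()
    ]


if __name__ == "__main__":
    # Hypothetical monthly sales figures, made up for this example.
    sales = {
        "North": [12, 14, 13, 17, 21, 25],
        "South": [20, 19, 18, 16, 15, 11],
        "West": [5, 6, 5, 7, 6, 8],
    }
    for row in small_multiples(sales):
        print(row)
```

There are no axes, gridlines or legends here, yet the three rows can be compared at a glance: the only thing that changes from row to row is the region.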

There is a simple way of removing clutter from graphs: ask yourself a few questions. Is the type of visualization you’ve chosen really the most effective way to represent that data? Is there a better, simpler way? If so, change it. Is there anything that could distract from the basic message of that data? If so, then remove it. Remove also all the “chartjunk”, such as non-standard fonts, excessive use of colors, animations, icons and other distractions. Remove redundancy (the same data point represented in more than one way). Experiment with removing “metadata”, such as grid lines, labels and trend lines. Only if the story of the data seems weaker without one of these should you put it back in. Edward Tufte describes this process as maximizing the “data-ink ratio”, and it’s a principle that is ever more relevant.
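Tufte defines the data-ink ratio as the ink devoted to the data divided by the total ink used in the graphic. As a rough sketch of the subtraction process described above, the Python below treats a chart as an inventory of elements and shows how the ratio rises as non-data elements are removed. The element names, the “ink” amounts and the helper functions are all hypothetical, chosen purely for illustration; real ink cannot be measured this neatly.

```python
def data_ink_ratio(elements):
    """Return data ink divided by total ink.

    elements is a list of (name, ink, is_data) tuples, where ink is an
    illustrative, made-up quantity rather than a real measurement.
    """
    total = sum(ink for _, ink, _ in elements)
    if total == 0:
        raise ValueError("the chart has no ink at all")
    data = sum(ink for _, ink, is_data in elements if is_data)
    return data / total


def declutter(elements, keep=()):
    """Drop every non-data element except those named in keep.

    keep models the "put it back only if the story is weaker without it" step.
    """
    return [e for e in elements if e[2] or e[0] in keep]


# A hypothetical chart inventory: (name, ink, is_data).
chart = [
    ("bars", 40, True),
    ("axis labels", 10, False),
    ("grid lines", 20, False),
    ("3D shading", 25, False),
    ("decorative icons", 5, False),
]

before = data_ink_ratio(chart)

# Suppose the chart turned out to be unreadable without its axis labels,
# so they are the one non-data element that goes back in.
cleaned = declutter(chart, keep=("axis labels",))
after = data_ink_ratio(cleaned)

print(f"data-ink ratio before: {before:.2f}, after: {after:.2f}")
```

The numbers matter less than the habit they encode: strip everything that is not data, then readmit only what the reader demonstrably needs.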