Much of what's been written about data visualizations focuses on making visualizations that accurately reflect the data. That is, they focus on creating meaning.
A second way of creating good visualizations is to make them memorable, which is done by making them visually engaging.
Putting aside everything you've heard about Chart Junk, which of these attracts your attention?
Meaningful + 2
A third way is to focus on creating visualizations that are instantly understandable. This is done by using various techniques to amplify the pattern in the data, so that we more quickly see what's going on.
24 techniques for improving visualizations
This infographic shows 24 techniques for improving visualizations, grouped into the three broad categories.
Today I am going to explore a small set of the more advanced techniques. I'm focusing on the techniques that I find most market researchers don't know.
Who Trump has attacked
The simplest way to create a memorable visualization is to create something visually striking. If it's dramatic we can remember it.
This is fun. But, it's not a great visualization. It makes you remember Octopuses with orange hair, rather than the story in the data.
This one tries to grab attention in a different way. It uses a design inspired by the subject matter.
I think it's beautiful. This makes it memorable.
But, it's a poor visualization.
When I look at it I see stairs. Stairs going down.
The heading talks about the market growing, but the bottom bar has lines through it, so the visualization isn't instantly understandable.
Iraq's blood toll
This one's a bit hard to read, but it's a lot cleverer than the one before. Again, it uses the simple technique of a visual that's tied to the subject matter.
But, it does something a bit cleverer than that as well. The upside down column chart illustrates and resembles the flowing of blood.
So, the smart thing it's done is relate the pattern in the data to the subject matter.
It's not just attracting attention. It's a mnemonic of sorts.
In terms of design quality, this one's a bit paint by numbers. I did it myself.
A wallet and the color of money evoke the subject matter.
But, at a deeper level it's doing something cleverer. The core bit of the visualization is an idea we are familiar with.
A drop in the ocean.
A needle in a haystack.
A speck of dust.
It emphasizes that the initial investment is dwarfed at a monumental level by the outcome.
All the visualizations that I have shown you so far require a creative spark. That's all fine, but what do you do when you can't find the creative spark or don't have the design skills?
That's my focus today. Understanding the science, rather than the art, of creating impactful visualizations.
When we look at a visualization we give it the benefit of the doubt. This benefit only lasts an instant. Once the instant has passed, we're bored. We move on. The chance to communicate has been lost.
The creation of Adam
An artist friend tells me this is not great art. It's a cartoon. But, in only a matter of seconds we get it. It talks to us. We are captivated by Michelangelo's work. We can extract meaning from it.
But this? It looks like bricks. I think this is why most people don't like modern art. They can't instinctively extract meaning from Mondrian's work. It's too hard. They just move on.
The greatest visualization of all time
In books on visualizations, this one's often described as the best of all time. But, is it? What do you see when you look at it? I see a tree branch. Yes, if you study it and read books on it, it is actually a very cool Sankey diagram.
But, at an instinctive level, it doesn't work, as I don't look at branches in search of meaning. This is more Mondrian than Michelangelo.
Why can't we instinctively extract meaning? Let's do some thought experiments.
You have five seconds
You have 5 seconds to remember what is on the next slide.
What did it say?
Can you remember?
You have 5 seconds
Let's try it again. You've got 5 seconds.
What was the word?
Even if you don’t know your Mary Poppins, there’s a good chance you can remember some of the word, as it was made up of words you know.
Our brains find it easier to understand and thus recall things that are familiar.
Successful visualizations are ones that are in some sense familiar.
Great visualizations use shapes that we instinctively understand.
Rather than show data in all its ugliness, we should work to exaggerate the patterns in visualizations to make them clear.
And, we want to exaggerate the shape if at all possible.
To use some jargon: We want to supernormalize the visualization.
The herring gull's world
Dutch Nobel laureate Niko Tinbergen did a cruel but fascinating study. He noticed that birds tended to instinctively peck at their mothers' beaks as soon as they hatched. So, he took away their mothers to see what they would do, and showed them a cardboard cut-out instead. They pecked at it.
He then showed them a two beaked monster. They pecked at it.
A thin red…
His fascinating finding was that they pecked more ferociously at a red rod with three white lines than a plaster model of their mother’s head.
Why? Because the instincts in the brain have evolved to have a simplified concept of what to peck at, and the supernormal stimulus of the red rod taps into this better.
Let's work through some examples and see what we can do with this idea.
We instinctively understand intensity
We instinctively understand intensity. We know what burns us.
Treemap with heatmap shading
What can you see here?
Yes, the small cell with the small font, Typhoid, jumps out at us.
Now, some people hear this and think, oh, so tree maps are good. No, it's not about the type of visualization. It's about the use of color to communicate intensity, in a way that taps into how we have evolved to evaluate stimuli in the real world.
Treemap with unnatural heatmap shading
Here's another tree map. This grey, red, green scale is not natural, which makes this visualization hard to process. Where should our eye be drawn to look?
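A minimal sketch of a "natural" intensity scale, in Python: each value is mapped to a single hue that gets darker as the value gets bigger, so the biggest number is also the most intense color. The function name, the white-to-dark-red end points, and the example numbers are all assumptions for illustration. They are not the data or the palette behind either tree map.

```python
def intensity_color(value, lo, hi, light=(255, 255, 255), dark=(139, 0, 0)):
    """Map a value in [lo, hi] to a hex color between `light` and `dark`.

    Bigger values get darker, more intense colors.
    """
    if hi == lo:
        t = 0.0
    else:
        t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))  # clamp values that fall outside [lo, hi]
    rgb = [round(l + t * (d - l)) for l, d in zip(light, dark)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical values, purely for illustration.
values = {"Category A": 2, "Category B": 10, "Category C": 50}
colors = {name: intensity_color(v, 0, 50) for name, v in values.items()}
for name, color in colors.items():
    print(name, color)
```

The design choice is one hue, one direction. A scale that runs through grey, red, and green gives the eye no obvious "more" end.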
Car switching data
This data shows switching between car brands.
Let's look at the Ford row.
63 Ford owners switched to BMW.
107 to Citroën.
64 to Fiat.
4 thousand stayed loyal to Ford.
How should we visualize this?
Chord Diagrams were invented for this.
Recent > ...Chord Diagram
Switching matrix: car.switching.
Let's look at Ford.
So, we learn that its biggest competitors are Rover and GM.
Oh so cool. But, not so natural to read. What else can you see?
We would have to spend a lot of time interacting with this to extract any insight.
It looks like the Great Pit of Carkoon from Return of the Jedi. Fascinating, yes, but not somewhere we expect to find a pattern.
We instinctively understand space
By contrast, we instinctively understand how to interpret objects placed in two dimensional space.
Correspondence analysis of a square table
Recent > Correspondence analysis of a square table > car.switching.
Labels font size: 16
And this is why I can show you an exotic technique called Correspondence Analysis of a Square Asymmetric Table, and you can quickly get a feeling for the entirety of the table. See what you can learn about who Ford competes with? What about BMW?
Not as sexy as the chord diagram, but much easier on your brain.
Perceptions of supermarkets
Here's some data on perceptions of supermarkets. Let's have a go at visualizing it.
Visualizing perceptions of supermarkets
I'm going to do something a bit exotic, and create something you've probably not seen before called a moonplot.
+Anything > Recent > Correspondence Analysis ... Table
While this style of visualization may be new to you, your brain instinctively allows you to understand how Aldi is different.
But, this is not by any means perfect. While it tells us that Aldi is closer to Low prices than Costco, it's hard to know how much closer. The visualization is only communicating relativities.
What's the type of visualization that most market researchers would do?
Insert > Visualization > Radar chart
If you look, you can see that it is also telling us that Aldi owns Low prices.
But, it's not a great visualization. What is your brain seeing?
This is what I see
We instinctively compare shapes
We instinctively compare shapes.
Now, this is much better.
Your brain instinctively understands that we need to compare shapes. Without even reading the labels, you know that Aldi and Harris Farm are different, as their shapes are so different.
So, why do our brains find it so much easier?
The jargon for what we have done is we've created small multiples. That is, multiple small visualizations, which our brains are good at comparing.
Let's do a little visual experiment to understand why our brains like small multiples.
Spot the odd one out
And what's the odd one out here?
And what's the odd one out here?
This last one is just quite a lot harder on our brain.
For those of you that didn't spot it, I've highlighted it.
What makes the one on the right so much harder?
In the one on the left, we have encoded information in three dimensions.
The horizontal and vertical position, and color. Our brains can do that.
In the middle one, again three. We've replaced color with rotation.
But, the one on the right uses four dimensions.
Adding each dimension makes our brains work harder to decode meaning.
It slowwwwws us down.
So, the point here is that we want to use as few visual dimensions as we possibly can.
When designing visualizations, we need to make the patterns in data really really hard to miss.
OK, so we want to use as few dimensions as we can. But, which dimensions should we use?
What is the pattern?
Here I have encoded data into four graphical dimensions.
What's the pattern?
There's a correlation here. Can you see it?
Just in case you didn't spot it, darker shades are more horizontal. The pattern is a perfect correlation. It's encoded in only two dimensions, shading and rotation, but we couldn't easily spot it.
What's the pattern here?
That's right. It is easier to see. The higher the vertical position, the darker the shading.
What made this one easier to see?
The previous one encoded information in rotation. We can process that info. But, it was hard. Here we replaced rotation with vertical position, and our brains found it much easier to spot the correlation between the shading and the vertical position.
We instinctively understand height
We looked before at how we instinctively understand two dimensional space.
This also means we instinctively understand height.
This is why column charts are so good.
And now, the pattern cannot be missed. There is perfect correlation between the horizontal and vertical position.
The shading is irrelevant. The rotation is irrelevant.
It's unmissable because the key pattern is communicated by horizontal and vertical coordinates.
Each visualization shows a perfect correlation. Each time it's encoded in only two graphical dimensions. But, it was only easy to spot when the pattern was encoded in the horizontal and vertical coordinates.
This experiment's not an accident. Our brains are awesome at decoding information in horizontal and vertical space. So, we should always strive to create visualizations where the key patterns are encoded this way.
TV trends data
Here's some data on TV trends.
How should we plot it?
We instinctively understand line and area charts
Through our history we have needed to judge the slopes of mountains and hills, so it is no surprise we find line and area charts easy to interpret.
But, not all line and area charts.
Visualizing data on TV shows
Since market research was invented, the line chart has been the "go to" way of visualizing time series.
Insert > Visualization > Line Chart
Output in pages: tv.trends
What can you see here?
That's right. A mess.
If you have amazing color perception, you can perhaps see that Game of Thrones has some big spikes. That's about it.
The underlying data here is mutually exclusive, so we can make things a bit better using stacked area charts.
Chart type: Area
Check Stack series
What can you see here?
Yes, Game of Thrones has some big spikes. And, The Walking Dead, in the brown, seems pretty steady.
But, still not great.
Chart type: Stream
We can make it better by using a stream graph.
Stream graphs are great eye candy.
But, they are also quite clever.
Looking here, we can clearly see the Game of Thrones peaks.
Hover over bottom blue
But, we can see stuff that wasn't obvious before.
Hover over middle blue
We can see that Stranger Things peaked at the same time as Game of Thrones, albeit to a much lesser extent.
I want to draw attention to a visualization principle that underlies why this visualization is better than the stacked area chart.
By being kind of symmetrical around the horizontal, it frees up more opportunity for us to see variations.
It turns out this is a general principle of visualization. Making things look a bit symmetrical is often good.
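To make the symmetry idea concrete, here is a rough Python sketch of the simplest way to center a stacked chart. At each time point the stack starts at minus half of the total, so the layers extend equally above and below the horizontal axis. This is only the basic centered layout with made-up numbers. The stream graph shown here may well use a more refined baseline, and the function name is hypothetical.

```python
def centered_layers(series):
    """Compute (bottom, top) bounds for each layer of a centered stack.

    `series` is a list of equal-length lists, one list per TV show.
    The result has the same shape, with a (bottom, top) pair per time point.
    """
    n_points = len(series[0])
    layers = [[] for _ in series]
    for t in range(n_points):
        total = sum(s[t] for s in series)
        bottom = -total / 2  # start half of the total below the axis
        for i, s in enumerate(series):
            top = bottom + s[t]
            layers[i].append((bottom, top))
            bottom = top  # the next layer sits on top of this one
    return layers


# Hypothetical viewing figures for three shows over three periods.
shows = [[1, 4, 1], [2, 2, 2], [1, 2, 1]]
layers = centered_layers(shows)
for show_layers in layers:
    print(show_layers)
```

In a normal stacked area chart the bottom series is flat against the axis and every spike pushes all the other series upward. Centering shares the distortion between the top and bottom halves.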
But, there's something we can do even better here.
Chart type > Area
Check : Show as small multiples
Chart type: Opacity: 1
This is an awesome chart.
No, it's not as cool as the stream graph. But, it's just so much more useful. So easy to read.
So why do our brains find the small multiples so much easier?
- We've stopped using color as an important variable, and doubled down on our ability to see simple patterns organized in two dimensions. You usually can't read a line chart without colors. Here, we can.
OK, so hopefully I've persuaded you that small multiples are a bit of magic.
What's another technique we should use?
We can encode the same pattern into visualizations multiple times. That is, to perform what's called redundant encoding.
A small table
Let's start with a small example, involving average consumption of different brands of cola.
Most people show this as a bar chart.
Insert > Visualization > Bar
Output in pages: table.Q9.Cola.drinking.frequency
Like most bar charts, a challenge with this one is that we have to read down to the axis to see what it means.
Having to look for such information slows the reader down.
So, we can improve it by adding labels and removing the x axis.
Chart > DATA LABELS > Show data labels
Decimal places: 1
VALUES (X) AXIS: Show Axis Title
And we should of course sort it.
But there's something else interesting about this type of data. The numbers being plotted are between 0 and 7.
Why's that important?
It means the data is easily countable by the human brain.
We can tap into this by creating a pictograph.
Inputs > Bar > Bar Pictograph
I will have to tell the pictograph to scale it so that an average of 1 is one picture.
Chart > Units per icon (scale): 1
That's much better.
But, why men?
We could also put in a custom icon.
Note that color here is entirely uninformative. It's just a distraction. Irrelevant encodings slow us down.
Data Series: Color palette: Custom Gradient
I will put in a custom gradient, starting with Black.
Gradient start: our black
Gradient end: our red
Note that Pepsi light is a different shade to Pepsi. Why is that? It's doing the coloring based on the order of the categories. We can hook it up to be proportional to the data.
Gradient start: More colors > Drag the point across to the left
We've done quite a few things to make this visualization easy to interpret. They are all examples of a general principle, known as redundant encoding.
Swap around black and red
Here we have only one pattern, but we have encoded it in six different ways:
The numbers show the result.
The end points of the bars show the result.
The area of the bars is proportional to the result.
The user can count the number of drinks.
The shading shows the result.
The rank order shows the result.
With one pattern shown in six different ways, it's hard for a user to miss the point.
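As a rough illustration of redundant encoding, here is a small Python sketch that prints a text-only bar pictograph. It repeats one pattern four times: the rank order, the bar length, the countable icons, and the number itself. Area and shading are left out because plain text can't show them. The brand names and averages are hypothetical, and this is a sketch of the principle rather than how the pictograph above is built.

```python
def pictograph(averages, icon="*"):
    """Return one text line per brand, sorted from biggest to smallest.

    Each line shows the label, one icon per unit (rounded), and the number.
    """
    rows = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    width = max(len(name) for name in averages)  # align the bars
    lines = []
    for name, value in rows:
        icons = icon * round(value)
        lines.append(f"{name.ljust(width)}  {icons} {value:.1f}")
    return lines


# Hypothetical average weekly consumption, on the 0 to 7 scale.
cola = {"Brand A": 3.2, "Brand B": 1.0, "Brand C": 5.1}
for line in pictograph(cola):
    print(line)
```

Because the values are small, the icons can actually be counted, which only works when one icon stands for one unit.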
Long lists (e.g., brands)
Now here's a problem that every market researcher faces. How do you visualize a table with lots of rows?
We are good at interpreting heights, so column charts are often the place to start.
Insert > Visualization > Column Chart
Output in pages > table.Spontaneous.awareness
OK, this is crap.
The standard fix is to merge all the little brands into an "other" category. But, back in my consulting days that often meant I had to then re-run things to include some tiny pet project brands that the client cared about but I hadn't known about when doing the chart.
What about a bar chart?
Chart type: Bar
Still a mess, but at least we can read the labels a bit more easily.
But, it's skipping some brands as there's not enough room for all the labels.
We can't even see the biggest brand.
We instinctively understand proportionality
We can all tell how much of a cake has been eaten. This is why no matter how many people tell us that pie and donut charts are rubbish, they remain so popular. They work.
Donut and pie charts have a bad reputation, but their main problem is that most of them are just badly created. Let's see how one works when it's been designed for market research.
Chart type: Donut.
Still not great, but note that at least we can see all the labels for the bigger brands.
What should we do next?
This one you know. We need to sort.
Click on table, sort by values %, descending
Now, we've got a much better chart.
The thing to note here is that the colors are just a pure distraction. We can make it better if we reduce the variation in the colors.
There are lots of color palette choices here.
Chart > Data Series > Color Palette > Reds, Dark to Light
Let's say that T-Mobile is our client. How can we improve this more?
Reduce color + Emphasize
Two techniques that often work well in tandem are reducing color and then emphasizing.
I'm just going to paste in the hex codes for a whole lot of greys.
Color Palette: Custom palette
Paste in #c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0,#c0c0c0
Now, I mentioned before that our client is T-Mobile. They have trademarked their own pink.
Paste in as third color: #ea0a8e
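Rather than pasting the same grey dozens of times by hand, the palette string can be generated. This is a small Python sketch, assuming the chart accepts a comma-separated list of hex codes as pasted above. The function name is hypothetical, and the highlight position is 1-based to match "third color".

```python
def emphasis_palette(n_colors, highlight_position,
                     grey="#c0c0c0", highlight="#ea0a8e"):
    """Build a comma-separated palette: all grey except one emphasized slot.

    `highlight_position` is 1-based, so 3 means the third color.
    """
    if not 1 <= highlight_position <= n_colors:
        raise ValueError("highlight_position must be between 1 and n_colors")
    palette = [grey] * n_colors
    palette[highlight_position - 1] = highlight
    return ",".join(palette)


# Five colors here to keep the output short; use as many as there are brands.
print(emphasis_palette(5, 3))
```

If the sort order changes, only `highlight_position` needs updating, which is easier than hunting for the right slot in a long pasted string.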
Banking to 45 degrees
Banking to 45 degrees is one of my favorite techniques.
A case study. Not market research data, but too beautiful not to show.
Dark spots appear on the sun. Sun spots. They may last for a few days or months.
This data set tracks the number of sun spots over 250 years.
There are 3,177 data points.
I'll plot them as an area chart.
Insert > Area chart
Output in pages: sunspot.month
What can you see?
First, it oscillates, going up and down every 11 years.
Second, there's a bigger oscillation in the trend data, which goes down from 1800 to 1820, then down again in the 1880s, and so on.
What else can you see?
OK. Now for the magic. I'm going to shrink the chart.
What can you see now? Squint. It's worth it.
That's right. The sun spots increase at a much faster rate than they decrease.
Start at bump two to the left of 1800.
See here, it spikes up. Then, it decreases more gradually.
This is a common pattern.
Banking to 45 degrees
It turns out that if we change the height and width so that the average slope of the lines we want to examine is about 45 degrees, we create a visualization that's much easier to read.
No need to be too mathematical. Your eye is good enough.
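If you do want a number to start from, here is a minimal Python sketch of one common heuristic: pick the height-to-width ratio at which the median absolute slope of the line segments is 45 degrees on the screen. This is an assumption about how to formalize "about 45 degrees", not the only way to do it, and the function name and example data are hypothetical.

```python
from statistics import median


def banked_aspect_ratio(x, y):
    """Return the height/width ratio that banks the median segment to 45 degrees.

    Slopes are measured after rescaling x and y to their ranges, so a result
    of 0.25 means the plot area should be four times as wide as it is tall.
    """
    x_range = max(x) - min(x)
    y_range = max(y) - min(y)
    if x_range == 0 or y_range == 0:
        raise ValueError("need variation in both x and y")
    slopes = []
    for i in range(1, len(x)):
        dx = (x[i] - x[i - 1]) / x_range
        dy = (y[i] - y[i - 1]) / y_range
        if dx != 0 and dy != 0:  # skip vertical and perfectly flat segments
            slopes.append(abs(dy / dx))
    if not slopes:
        raise ValueError("no sloped segments to bank")
    return 1 / median(slopes)


# A hypothetical saw-tooth series: steep zig-zags call for a wide, short chart.
print(banked_aspect_ratio([0, 1, 2, 3, 4], [0, 10, 0, 10, 0]))
```

Treat the result as a starting point and then adjust by eye, in the spirit of the advice above.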
Attitudes to political institutions
Here's a second example. We're looking at some survey data showing attitudes toward various institutions.
Which is the better chart?
Clearly the one on the right.
In it, I've done four things:
- banked the data to 45 degrees, as just discussed,
- used lots of grey,
- emphasized the key pattern I'm interested in, and
- reduced the number of eye lookups by getting rid of the legend.
Note the really cool thing about banking to 45 degrees.
It leads to smaller charts. That is, the chart no longer takes up nearly as much room. So, we can add in more commentary or other insights if we wish.
Why does the banking work? Here's my theory.
Lean back. What do you see?
Lean back. What do you see?
Lean in. What do you see?
Lean in. What do you see?
Lean in all the way until your nose hits your screen. What do you see? A gremlin. Look at the right, there's a toucan as well with a big claw.
If we are 50 cm…
For optimal viewing, we want an image to span an angle of about 4 to 6 degrees at our eyes. This means that if we are viewing from about 20 inches/50cm away, our ideal image is only about 2 inches/5cm high.
This explains a bit why banking works. We want key features of the data to be visible in a small area, about 2 inches by 2 inches. Resizing things so that they are at 45 degrees most efficiently achieves this.
It also helps us further understand why small multiples work. Each of the small visualizations is around this optimal viewing size.
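The viewing-size arithmetic can be checked with a little trigonometry. In this Python sketch, an image that spans a given angle at a given distance has a height of 2 × distance × tan(angle / 2). The function name is hypothetical, and the 4 to 6 degree range is taken from the text as given.

```python
import math


def image_height(distance, angle_degrees):
    """Height of an image that spans `angle_degrees` at `distance`.

    The result is in the same unit as `distance`.
    """
    return 2 * distance * math.tan(math.radians(angle_degrees) / 2)


# Viewing from 50 cm away.
for angle in (4, 6):
    print(angle, "degrees:", round(image_height(50, angle), 1), "cm")
```

At 50cm this works out to roughly 3.5cm to just over 5cm, which is in line with the "about 2 inches/5cm" figure.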
This is my all-time favorite technique. I even wrote the Wikipedia page!
The basic idea is that you rearrange the rows and columns of charts, tables, and small multiples, so that diagonal patterns appear.
Universal commercial history
And this is my all-time favorite visualization. Each area shows the size of an economy, from 2000 BC through to 1804.
Look at what the War of Independence did to the US economy.
Note it's using small multiples.
Note that the curves are banked to the extent possible.
The genius that made this has ordered the economies so that their bumps make a diagonal line. That the diagonal line isn't straight emphasizes something very profound.
There are no bumps between the birth of Christ and the Renaissance, and the Dark Ages, where only Constantinople thrived, are made stark.
Look at this big table. So many columns we can't even view it.
If we insert a heatmap, we can see everything.
Insert > Visualization > Heatmap
Output in pages: Table.big….
Y Axis > Font size: 10
X Axis > Font size: 10
It looks like a pretty random pattern of tiles. What's the story?
The secret to a large table like this is to diagonalize it. That is, rearrange the rows and columns to make the patterns clearer.
As it's got 27 columns and 6 rows, it would take a long time to do it by judgement.
Fortunately, you can do it algorithmically.
Select the table
Sort > Sort Columns by Pattern
Sort > Sort Rows by Pattern
Note we now have a diagonal pattern.
Looking at the block at the top left, it's grouped together the four sugar free products.
We can see that they are all relatively strong in terms of Weight conscious and health conscious.
And, looking at the bottom row, we can see that Coke and Pepsi are pretty similar, and they skew to being traditional, older, reliable.
So, we are moving rows around, and we are doing so until we've created a diagonal pattern in the data.
You can see it here.
Why is this good?
At a very basic level, the diagonal line is a pattern, which our brain can understand.
Then, once we see the pattern, we can attribute meaning to it.
At the top left, Coke and Pepsi are very similar.
What do they have in common? They skew more to being traditional, older, reliable, honest.
Coke's darker, so stronger than Pepsi. But, they're pretty similar.
Pepsi Max and Coke Zero are similar. But, Pepsi Max is a bit more masculine, confident and tough. It's closer to Coke and Pepsi.
Coke Zero is much more oriented towards weight and health conscious.
Diet Coke and Diet Pepsi are really strong with Weight and health, but a bit more feminine and innocent.
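For the curious, here is one simple way the reordering can be done algorithmically, sketched in Python. It repeatedly sorts the rows by where their weight sits across the columns, then sorts the columns by where their weight sits down the rows, until big values line up along a diagonal. This is a basic heuristic chosen for illustration. It is not necessarily what the Sort by Pattern menu options do, it can settle on an imperfect ordering, and the function name and the tiny table are hypothetical.

```python
def diagonalize(matrix, row_labels, col_labels, iterations=10):
    """Reorder rows and columns so that large values form a diagonal.

    Returns (reordered_matrix, reordered_row_labels, reordered_col_labels).
    """
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))

    def centroid(values):
        """Weighted average position of the values (0 = first position)."""
        total = sum(values)
        if total == 0:
            return 0.0
        return sum(pos * v for pos, v in enumerate(values)) / total

    for _ in range(iterations):
        # Rows whose weight sits in the left-hand columns move to the top.
        rows.sort(key=lambda r: centroid([matrix[r][c] for c in cols]))
        # Columns whose weight sits in the top rows move to the left.
        cols.sort(key=lambda c: centroid([matrix[r][c] for r in rows]))

    reordered = [[matrix[r][c] for c in cols] for r in rows]
    return (reordered,
            [row_labels[r] for r in rows],
            [col_labels[c] for c in cols])


# A hypothetical scrambled table: three brands by three attributes.
table = [[0, 9, 0],
         [9, 0, 0],
         [0, 0, 9]]
result, brands, attributes = diagonalize(
    table, ["Brand A", "Brand B", "Brand C"], ["Attr X", "Attr Y", "Attr Z"])
for brand, row in zip(brands, result):
    print(brand, row)
```

The labels are reordered together with the numbers, so the diagonal blocks can be read off directly as "these brands share these attributes".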
Visualizing perceptions of supermarkets
Here's our earlier small multiples of radar charts.
- I've reordered these to make the patterns easier to see.
- And, to further emphasize this, I've plotted the averages
- Note we can now see that on the left, Woolworths and Coles are basically the same. They are above average on everything, except Woolworths is a bit light on price.
- All the other brands are weak on most things, with points of differentiation more obvious.
This is awesome. Not pretty. But awesome. The small multiples have been lined up to make patterns clear. If I ask you which industries first declined and then picked themselves up, you can quickly work it out.
We've covered most of the 24 techniques here.