Visualization
| 13 February 2017 | by Tim Bock

Using Correspondence Analysis to Find Patterns in Tables


There are lots of great visualizations designed for analyzing big quantities of data. Heatmaps, for example, are super-popular. However, when I am in a rush, my “go to” approach to analyzing big tables is almost always correspondence analysis. Although a bit more technical, it tends to get me to the key insights much faster.

 


 

A Correspondence Analysis Example

The data below shows the proportion of people who associate each of 15 personality attributes with each of 42 brands. The table is too big to digest readily.

 

The scatterplot below shows the results of a correspondence analysis of the same table. Correspondence analysis identifies the main relationships between the rows and columns of a table, and plots them on a two-dimensional map. You could use more dimensions, but as computer screens are two-dimensional, maps with more than two dimensions tend to be much harder to read.
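For readers who want to see the mechanics, here is a minimal sketch of the textbook way to compute a correspondence analysis, using a singular value decomposition of the standardized residuals. It is not necessarily how Displayr computes its maps. The function name, the tiny table, and its numbers are invented purely for illustration.

```python
import numpy as np

def correspondence_analysis(table, n_dims=2):
    """Sketch of classical correspondence analysis (hypothetical helper).

    Returns principal coordinates for rows and columns, plus the inertia
    (squared singular value) of every dimension.
    """
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                 # table rescaled to sum to 1
    r = P.sum(axis=1)               # row masses
    c = P.sum(axis=0)               # column masses
    expected = np.outer(r, c)       # what P would be if rows and columns were unrelated
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates; the sign of each axis is arbitrary.
    rows = (U[:, :n_dims] * sv[:n_dims]) / np.sqrt(r)[:, None]
    cols = (Vt.T[:, :n_dims] * sv[:n_dims]) / np.sqrt(c)[:, None]
    return rows, cols, sv ** 2

# Made-up table: 4 brands (rows) by 3 attributes (columns).
table = [[30, 10, 5],
         [25, 12, 8],
         [6, 28, 20],
         [4, 22, 30]]
rows, cols, inertia = correspondence_analysis(table)
```

Plotting the first column of `rows` and `cols` against the second gives a map of the kind shown here. The share of `inertia` carried by the first two dimensions indicates how much of the table's structure the map retains.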

 

If you wish to do correspondence analysis yourself, or inspect these examples in more detail, check out Displayr.

 


 

Interpreting Correspondence Analysis

This chart is much simpler to digest than the whole table. At the bottom-left we can see that Calvin Klein, American Express, Apple, and Lexus are Upper-class. Porsche mixes Upper-class and Daring. At the top-left, we can see that Tough is shared by Nike, Reebok, Levi’s and Michelin, which are also a bit Outdoorsy.

One key tip if you are new to correspondence analysis: the closer anything is to the middle of the map, the less distinct it is. Thus, on this map, we can see that Qantas is poorly described by any of the personality attributes. Similarly, Successful and Imaginative are personality attributes that are not good differentiators between the brands.
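The "closer to the middle means less distinct" rule can be checked directly from a table. A row's distance from the origin of the full map (using all dimensions, not just the two plotted) equals the chi-square distance between that row's profile and the average profile. The sketch below computes it. The function name, brand labels, and counts are made up for illustration.

```python
import numpy as np

def distance_from_average(table):
    """Chi-square distance of each row profile from the average profile.

    Hypothetical helper: a value of 0 means the row looks exactly like the
    average row, so it would sit at the centre of the map.
    """
    N = np.asarray(table, dtype=float)
    profiles = N / N.sum(axis=1, keepdims=True)   # each row rescaled to sum to 1
    average = N.sum(axis=0) / N.sum()             # average profile (column masses)
    return np.sqrt((((profiles - average) ** 2) / average).sum(axis=1))

# Made-up counts: three distinctive brands and one that matches the average.
brands = ["Brand A", "Brand B", "Brand C", "Brand D"]
table = [[40, 10, 10],
         [10, 40, 10],
         [10, 10, 40],
         [20, 20, 20]]
distances = distance_from_average(table)
for brand, d in zip(brands, distances):
    print(f"{brand}: {d:.3f}")
```

In this invented example, Brand D has the same profile as the average, so its distance is zero. It plays the role that Qantas plays on the map above.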

We can also see that a continuum of sorts is evident in the data. It goes from Upper-class and Intelligent at the bottom-left, through to Cheerful and Down-to-earth at the top-right.

As is always the case when we fit a model to data, there is no free lunch. Correspondence analysis just summarizes the data. Like many summaries, it can be superficial and at times misleading. For this reason, I always check that any key conclusions that I draw from a correspondence analysis are also clearly visible in the original data table or a heatmap.
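One simple way to do this check, sketched below under the assumption that the table holds counts or comparable non-negative scores, is to divide each cell by the value expected if brands and attributes were unrelated. Ratios well above 1 for a brand and attribute pair support a conclusion read off the map, while ratios near 1 suggest the map may be overstating things. The function name and the numbers are invented for illustration.

```python
import numpy as np

def observed_over_expected(table):
    """Ratio of each cell to its expected value under independence (hypothetical helper)."""
    N = np.asarray(table, dtype=float)
    expected = np.outer(N.sum(axis=1), N.sum(axis=0)) / N.sum()
    return N / expected

# Made-up counts: rows are brands, columns are attributes.
table = [[40, 10, 10],
         [10, 40, 10],
         [10, 10, 40],
         [20, 20, 20]]
ratios = observed_over_expected(table)
print(np.round(ratios, 2))
```

The same matrix of ratios can also be passed to a heatmap function in a plotting library of your choice.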


Author: Tim Bock

Tim Bock is the founder of Displayr. Tim is a data scientist who has consulted, published academic papers, and won awards for problems and techniques as diverse as neural networks, mixture models, data fusion, market segmentation, IPO pricing, small sample research, and data visualization. He has conducted data science projects for numerous companies, including Pfizer, Coca Cola, ACNielsen, KFC, Weight Watchers, Unilever, and Nestle. He is also the founder of Q (www.qresearchsoftware.com), a data science product designed for survey research, which is used by all seven of the world’s largest market research consultancies. He studied econometrics, maths, and marketing, and has a University Medal and PhD from the University of New South Wales (Australia’s leading research university), where he was an adjunct member of staff for 15 years.







