Correspondence analysis is a data analysis technique which summarizes the patterns in a table of data as a visualization. Tables with more than a handful of rows and columns can often become difficult to read, and the patterns can get hidden among the digits. Correspondence analysis can make it much easier to see the story.
There are a few variations on the technique of correspondence analysis (including correspondence analysis of square tables, multiple correspondence analysis, and correspondence analysis of multiple tables), but in this post I focus on the most common technique, which could be called traditional correspondence analysis. This technique was originally derived to understand the patterns in contingency tables, but it can readily be applied to other kinds of data as well. In this post, I show you how to set up your correspondence analysis in Displayr.
As with all data analysis, there is a range of issues to keep in mind as you conduct your analysis and interpret the results. For a deep dive, check out our eBook and other posts on the topic.
Step 1 - Create your table
The starting point of your analysis is the table of data that you want to analyze. While the original application of correspondence analysis was for contingency tables of counts, the technique works effectively for a range of other kinds of data so long as the data is all on the same scale. This includes crosstabs showing counts, percentages, or averages, grids of data created from binary variables, and even raw numeric data.
There are three main ways that you can add a table to Displayr:
- Paste in your data.
- Use Displayr's built-in statistical engine to compute the table from raw data.
- Use R to compute a table.
I will briefly explain how to do these below.
Option A - Paste in data
The simplest path to a correspondence analysis is when you already have the table you want to analyze. In this case you can just paste it right in. To do so:
- Select Home > Paste Table.
- Click Type or paste data in the Object Inspector on the right side of the screen.
- Paste your table of numbers into the spreadsheet-style interface and click OK.
- Click Calculate.
Your table will appear on your page. If you don't need to use the table for another analysis or visualization, you can also paste this data into the correspondence analysis directly.
Option B - Use the built-in statistical engine
Displayr has a powerful engine for computing tables (and charts) from raw data. Before you can use it, you must have a data set added to your document. To add your data, select Home > New Data Set.
Once you've got a data set loaded, use the following steps to create your table:
- Click Home > Table.
- Choose the data that you want to show in the table using the Rows and Columns menus (sometimes called Show and By depending on the type of data you have selected) in the Inputs > DATA section in the Object Inspector on the right side of the screen.
- Choose which statistic you want to analyze using Inputs > STATISTICS > Cells. You should select one statistic for your table.
For example, here I have created a table based on some data from a survey about technology brands. The table shows the devices people own across different income brackets.
Option C - Calculating tables with R
You can run your own custom R code in Displayr using Insert > R Output. The code that you run will depend greatly on what data you have at hand, and what kind of table you want to create. Examples include:
- Using a function like table() to calculate a contingency table.
- Using data.frame() to construct a data frame object containing raw data.
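As a minimal sketch of both approaches, the R code below builds a small data frame of raw data and then cross-tabulates it with table(). The variable names (survey, income, device, device.by.income) and the values are hypothetical and are there only for illustration.

```r
# Hypothetical raw survey data: one row per respondent.
survey <- data.frame(
  income = c("Low", "Low", "Medium", "Medium", "High", "High", "High", "Low"),
  device = c("Phone", "Tablet", "Phone", "Laptop", "Laptop", "Phone", "Tablet", "Phone"),
  stringsAsFactors = TRUE
)

# table() cross-tabulates the two variables into a contingency table of counts,
# with devices in the rows and income brackets in the columns.
device.by.income <- table(survey$device, survey$income)

# Print the resulting table.
device.by.income
```

Either object (the raw data frame or the table of counts) could serve as the input, depending on what kind of table you want to analyze.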
Whichever method you use, the mechanism for connecting the data to the analysis, described in the next section, is the same.
Step 2 - Run your analysis
Now that you have your data in a table, you can add a correspondence analysis output to your document:
- Select Insert > Dimension Reduction > Correspondence Analysis of a Table.
- Click into the Input table(s) box in the Object Inspector on the right, and select the table you have created above.
- Tick Automatic. This ensures the analysis will re-run itself whenever the input data or options change.
- Remove any additional rows or columns that correspond to 'nets' or 'totals' by adding the corresponding row/column labels in the Rows to ignore and Columns to ignore sections on the right. These should typically not be included in the analysis, and Displayr automatically removes the default ones.
- Customize your title, colors, fonts, and grid lines using the settings on the right.
The map will appear as a scatterplot on your page.
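If you are curious about what happens behind the scenes, the steps above can be roughly sketched in base R. This is the standard textbook calculation (a singular value decomposition of the standardized residuals), not Displayr's own code. The table, its labels, and the list of labels to ignore are hypothetical.

```r
# Hypothetical crosstab of counts, including a NET row and a NET column.
tab <- matrix(
  c(20, 15, 10,  45,
    10, 20, 15,  45,
     5, 10, 25,  40,
    35, 45, 50, 130),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Phone", "Tablet", "Laptop", "NET"),
                  c("Low", "Medium", "High", "NET"))
)

# Drop nets and totals so that they do not enter the analysis.
# The labels listed here are assumptions for this example.
ignore <- c("NET", "Total", "SUM")
tab <- tab[!rownames(tab) %in% ignore, !colnames(tab) %in% ignore]

# Correspondence matrix and its row and column masses.
P <- tab / sum(tab)
row.mass <- rowSums(P)
col.mass <- colSums(P)

# Standardized residuals from the independence model.
expected <- outer(row.mass, col.mass)
S <- (P - expected) / sqrt(expected)

# The singular value decomposition gives the dimensions of the map.
decomp <- svd(S)
k <- 1:2

# Principal coordinates of the rows and columns on the first two dimensions.
row.coords <- (decomp$u[, k] / sqrt(row.mass)) %*% diag(decomp$d[k])
col.coords <- (decomp$v[, k] / sqrt(col.mass)) %*% diag(decomp$d[k])
rownames(row.coords) <- rownames(tab)
rownames(col.coords) <- colnames(tab)

# Share of the total inertia captured by each dimension.
inertia <- decomp$d^2 / sum(decomp$d^2)

round(row.coords, 3)
```

Plotting row.coords and col.coords on the same axes gives a scatterplot of the kind described above. The signs of the dimensions are arbitrary, so a map may appear mirrored relative to other software.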
For more on how to interpret a chart for a correspondence analysis, see How to Interpret Correspondence Analysis Plots (It Probably Isn’t the Way You Think).
A nice alternative to the standard scatterplot output of correspondence analysis is the moonplot. To display a moonplot:
- Go to the Object Inspector on the right and change the Normalization setting to Row Principal.
- Change the Output option to Moonplot.
The moonplot for my brand image grid looks like this.
The moonplot shows the rows of the table (the brands in this case) in the center, and the columns of the table (the attributes in this case) around the edge of the circle. For reasons explained in Moonplots: A Better Visualization for Brand Maps, the moonplot can be easier to interpret than the standard chart.
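In the usual terminology of correspondence analysis, row principal normalization puts the rows in principal coordinates (scaled by the singular values) and the columns in standard coordinates (unscaled). The sketch below illustrates that calculation in base R, assuming this is what the Row Principal setting refers to. It is not Displayr's own code, and the brand image grid is hypothetical.

```r
# Hypothetical brand image grid: brands in rows, attributes in columns.
grid <- matrix(
  c(30, 10,  5,
    10, 25, 10,
     5, 10, 30),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("Brand A", "Brand B", "Brand C"),
                  c("Fun", "Reliable", "Innovative"))
)

# Same decomposition as a standard correspondence analysis.
P <- grid / sum(grid)
row.mass <- rowSums(P)
col.mass <- colSums(P)
expected <- outer(row.mass, col.mass)
S <- (P - expected) / sqrt(expected)
decomp <- svd(S)
k <- 1:2

# Rows (brands): principal coordinates, scaled by the singular values.
brand.coords <- (decomp$u[, k] / sqrt(row.mass)) %*% diag(decomp$d[k])

# Columns (attributes): standard coordinates, not scaled by the singular values.
attribute.coords <- decomp$v[, k] / sqrt(col.mass)

dimnames(brand.coords) <- list(rownames(grid), c("Dim 1", "Dim 2"))
dimnames(attribute.coords) <- list(colnames(grid), c("Dim 1", "Dim 2"))

round(brand.coords, 3)
round(attribute.coords, 3)
```

With this scaling, each brand sits at the weighted average of the attribute positions, weighted by that brand's row profile. As a result, the brands fall closer to the center of the map than the attributes do.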