A crosstab is a table showing the relationship between two or more variables. When the table shows the relationship between only two categorical variables, it is also known as a contingency table.
Example of a crosstab of two variables
The table below is a crosstab that shows, by age, whether somebody has an unlisted phone number. Each cell of the table shows the number of observations with that combination of values of the two variables. We can see, for example, that 185 people are aged 18 to 34 and do not have an unlisted phone number. Column percentages are also shown (these are percentages within the columns, so that each column’s percentages add up to 100%); for example, in the sample, 24% of all people without an unlisted phone number are aged 18 to 34.
The age distribution for people without unlisted numbers is different from that for people with unlisted numbers. In other words, the crosstab reveals a relationship between the two: people with unlisted phone numbers are more likely to be younger. Thus, we can also say that the variables used to create this table are correlated. If there were no relationship between these two categorical variables, we would say that they were not correlated.
In this example, the two variables can both be viewed as being ordered. Consequently, we can potentially describe the pattern as a positive or negative correlation (negative in the table shown). However, where the variables are not both ordered, direction has no meaning, and we can only refer to the strength of the correlation (i.e., not whether it is positive or negative).
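One common way to put a number on the strength of a relationship without any notion of direction is Cramér's V, which is derived from the Pearson chi-square statistic. The sketch below is an illustration rather than the method used for the table above: it compares each cell with the count expected if the two variables were unrelated, and it assumes every row and column contains at least one observation. The two small tables are invented for the example.

```python
from math import sqrt

def chi_square(table):
    """Pearson chi-square statistic for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def cramers_v(table):
    """Cramér's V: 0 means no association, 1 is the strongest possible."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return sqrt(chi_square(table) / (n * k))

# Invented tables: identical column distributions vs. a perfect association.
unrelated = [[10, 20], [20, 40]]
perfectly_related = [[10, 0], [0, 10]]

print(cramers_v(unrelated))          # 0.0
print(cramers_v(perfectly_related))  # 1.0
```

Because V is never negative, it describes strength only, which is why it suits tables where at least one variable is unordered.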
Crosstabs with more than two variables
It is common for crosstabs to contain more than two variables. For example, the table below shows four variables. The rows represent one categorical variable, which records brand preference, and the columns represent age and income-within-gender.
Crosstabs are routinely created with many more variables. For example, each row and each column may represent a different variable.
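As a rough sketch of how such a table can be assembled, the code below counts brand preference (rows) against a set of columns built from several variables: age on its own, and income nested within gender. The respondents, field names and category labels are all invented for illustration, and each respondent is deliberately counted once under every column variable.

```python
from collections import Counter

# Hypothetical respondents; every name and value here is invented.
respondents = [
    {"brand": "Brand A", "age": "18 to 34", "gender": "Female", "income": "Low"},
    {"brand": "Brand A", "age": "18 to 34", "gender": "Male", "income": "High"},
    {"brand": "Brand B", "age": "35 or more", "gender": "Female", "income": "Low"},
    {"brand": "Brand B", "age": "18 to 34", "gender": "Female", "income": "High"},
    {"brand": "Brand A", "age": "35 or more", "gender": "Male", "income": "High"},
    {"brand": "Brand B", "age": "35 or more", "gender": "Male", "income": "Low"},
]

def column_memberships(person):
    """Return the columns a respondent falls into, one per column variable."""
    return [
        ("Age", person["age"]),
        # Income nested within gender: one column per gender/income pair.
        ("Gender / Income", f'{person["gender"]}: {person["income"]}'),
    ]

counts = Counter()
for person in respondents:
    for column in column_memberships(person):
        counts[(person["brand"], column)] += 1

rows = sorted({brand for brand, _ in counts})
columns = sorted({column for _, column in counts})

for column in columns:
    cells = ", ".join(f"{row}={counts[(row, column)]}" for row in rows)
    print(f"{column[0]} | {column[1]}: {cells}")
```

Adding another column variable only requires another entry in `column_memberships`; the counting loop stays the same.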
Key decisions when creating a crosstab
In addition to selecting which variables to include in a crosstab, it is also necessary to work out which statistics to show. In the example table above, the column % and the sample size for each column are shown.
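To make these two statistics concrete, here is a minimal sketch that computes column percentages and column sample sizes from raw observations using only the Python standard library. The data is invented and much smaller than the table discussed earlier; the variable names are likewise illustrative.

```python
from collections import Counter

# Hypothetical raw data: one (age group, has unlisted number) pair per respondent.
observations = (
    [("18 to 34", "No")] * 2 + [("18 to 34", "Yes")] * 3
    + [("35 to 54", "No")] * 3 + [("35 to 54", "Yes")] * 1
    + [("55 or more", "No")] * 5 + [("55 or more", "Yes")] * 1
)

counts = Counter(observations)                       # cell counts
rows = sorted({age for age, _ in counts})            # row categories
cols = sorted({unlisted for _, unlisted in counts})  # column categories

# Sample size of each column, used as the base for the column percentages.
col_n = {c: sum(counts[(r, c)] for r in rows) for c in cols}

# Column %: each cell as a share of its column, so every column sums to 100%.
col_pct = {(r, c): 100 * counts[(r, c)] / col_n[c] for r in rows for c in cols}

print(f"{'':12}" + "".join(f"{c:>8}" for c in cols))
for r in rows:
    print(f"{r:12}" + "".join(f"{col_pct[(r, c)]:7.0f}%" for c in cols))
print(f"{'n':12}" + "".join(f"{col_n[c]:>8}" for c in cols))
```

Row percentages or total percentages follow the same pattern with a different base: the row total or the overall sample size instead of `col_n`.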
A second key decision is how to show statistical significance. The example table above uses lettering, in which the letters in a cell indicate which other specific columns that column differs from significantly. Alternatively, tests can be used which show whether a cell is significantly different from its complement (i.e., the rest of the sample).
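The two approaches can be sketched as follows. This is an illustration under stated assumptions, not a description of how any particular software produces its letters: it uses a pooled two-proportion z-test at the 5% level with no correction for multiple comparisons, and it follows the common convention of placing a column's letter in the cell of the column with the significantly higher percentage. The counts are invented.

```python
from math import erf, sqrt

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

def z_two_proportions(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for x1/n1 versus x2/n2."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return 0.0 if se == 0 else (x1 / n1 - x2 / n2) / se

def column_letters(row_counts, col_sizes, alpha=0.05):
    """For one row, map each column letter to the letters of columns it exceeds."""
    letters = [chr(ord("A") + i) for i in range(len(col_sizes))]
    result = {}
    for i, (xi, ni) in enumerate(zip(row_counts, col_sizes)):
        exceeded = ""
        for j, (xj, nj) in enumerate(zip(row_counts, col_sizes)):
            if i == j:
                continue
            z = z_two_proportions(xi, ni, xj, nj)
            if z > 0 and two_sided_p(z) < alpha:
                exceeded += letters[j]
        result[letters[i]] = exceeded
    return result

def complement_z(row_counts, col_sizes, i):
    """z statistic comparing column i with all other columns combined."""
    rest_x = sum(row_counts) - row_counts[i]
    rest_n = sum(col_sizes) - col_sizes[i]
    return z_two_proportions(row_counts[i], col_sizes[i], rest_x, rest_n)

# Invented row of a crosstab: 60%, 40% and 45% in columns A, B and C (n = 100 each).
row_counts = [60, 40, 45]
col_sizes = [100, 100, 100]

print(column_letters(row_counts, col_sizes))  # {'A': 'BC', 'B': '', 'C': ''}
print(complement_z(row_counts, col_sizes, 0) > 1.96)  # True
```

Lettering answers "which columns differ from each other", while the complement test answers "does this cell stand out from everyone else", so the two can flag different cells in the same table.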
In commercial research, the rows of a crosstab are historically referred to as stubs and the columns as banners.