Using Substitution Maps to Understand Preferences in Conjoint Analysis

Modern tools for analyzing conjoint data, such as hierarchical Bayes, produce rich outputs showing the preferences of each person in a market. The main deliverable from such research is a choice simulator. A practical challenge with choice simulators is that while they can answer any specific question, it is often hard to extract detailed insight from them about the underlying distribution of preferences. This post explains how to compute substitution maps, which allow an exploration of the differences between alternatives (e.g., different products in a market or different levels of an attribute).

Step 1: Create a switching matrix

The first step is to create a switching matrix. A switching matrix typically shows brand switching (e.g., what proportion of Ford buyers next buy a Ford, versus a BMW, versus a General Motors car, etc.). A switching matrix can be created from a conjoint choice simulator by computing share predictions for some base scenario (e.g., current market conditions), making one of the alternatives less desirable (e.g., by raising its price), seeing which alternatives its choosers switch to, and then repeating this for each of the other alternatives. In the table below, each row shows the proportion of people who switch from a cuisine when its price is raised from $10 to $20 per meal. Looking at the first row, for example, we can see that 37% of Pizza choosers will continue to choose pizza if the price is doubled, 16% switch to Chinese, etc.

While you can create a switching matrix by entering the scenarios and writing the numbers down, you can save a lot of time by using code to record the different scenarios. In Displayr, this is done by using Insert > R Output, pasting in the code below, and changing the first five lines to refer to the data of interest. A good tip when creating such a matrix is to always order the rows and columns according to market share or preference share, as it tends to make key patterns more obvious.
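The original R code is not reproduced in this text. As a rough sketch of the same logic in Python, the loop below builds a switching matrix from a choice simulator. Here `simulate_choices` is a hypothetical stand-in for the conjoint choice simulator: it takes a price scenario and returns each respondent's chosen alternative.

```python
import numpy as np

def switching_matrix(simulate_choices, n_alternatives, base_prices, price_rise=10.0):
    """Row i gives the share of alternative i's base choosers who choose
    each alternative after i's price is raised by `price_rise`."""
    base_choice = simulate_choices(base_prices)         # choices under the base scenario
    matrix = np.zeros((n_alternatives, n_alternatives))
    for i in range(n_alternatives):
        prices = base_prices.copy()
        prices[i] += price_rise                         # make alternative i less desirable
        new_choice = simulate_choices(prices)
        was_i = base_choice == i                        # respondents who chose i at base
        for j in range(n_alternatives):
            matrix[i, j] = np.mean(new_choice[was_i] == j)
    return matrix

# Toy simulator: 200 respondents with fixed preferences, each choosing the
# alternative with the highest utility net of price (illustrative only).
rng = np.random.default_rng(0)
utilities = rng.normal(scale=5.0, size=(200, 3))

def simulate_choices(prices):
    return np.argmax(utilities - prices, axis=1)

base_prices = np.array([10.0, 10.0, 10.0])
switching = switching_matrix(simulate_choices, 3, base_prices)
# Each row sums to 1: all of an alternative's choosers end up somewhere.
```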

Step 2: Compute switching shares

The main diagonal of the switching matrix shows loyalty. We can see that 37% of Pizza choosers stayed loyal despite the $10 price rise, 30% of Chinese choosers were loyal, 38% of Mexican choosers, etc. In the table below, the loyalty data has been removed and each row has been re-based so that it adds up to 100%.
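Removing the diagonal and re-basing the rows takes only a few lines. The matrix values below are illustrative, not the post's actual data:

```python
import numpy as np

# Illustrative 4-alternative switching matrix (rows sum to 1; the diagonal
# holds loyalty, the off-diagonal cells hold switching).
switching = np.array([
    [0.37, 0.25, 0.23, 0.15],
    [0.28, 0.30, 0.27, 0.15],
    [0.25, 0.22, 0.38, 0.15],
    [0.20, 0.30, 0.25, 0.25],
])

# Zero out the loyalty diagonal, then re-base each row to sum to 100%.
off_diagonal = switching.copy()
np.fill_diagonal(off_diagonal, 0)
switching_shares = off_diagonal / off_diagonal.sum(axis=1, keepdims=True)
```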

Step 3: Compute magnetism

The key feature of the table above is that the numbers in each column are relatively similar. For example, all the numbers in the Pizza column are in the range of 0.21 to 0.51, whereas in the right-most column, most values are less than 0.03. Such a pattern is the norm in switching data. We can therefore simplify this table considerably by taking the average of each column, which we can refer to as an alternative's magnetism.
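As a sketch (with made-up switching shares), magnetism is just a column average. One judgment call: since the diagonal is blank after Step 2, the average here is taken over each column's off-diagonal entries only.

```python
import numpy as np

# Illustrative switching shares (diagonal removed, rows re-based to 100%).
switching_shares = np.array([
    [0.00, 0.40, 0.35, 0.25],
    [0.45, 0.00, 0.35, 0.20],
    [0.40, 0.35, 0.00, 0.25],
    [0.30, 0.45, 0.25, 0.00],
])

# Magnetism: the average of each column's off-diagonal entries
# (the blank diagonal is excluded from the average).
n = switching_shares.shape[0]
magnetism = switching_shares.sum(axis=0) / (n - 1)
```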

Step 4: Index the switching shares

The next table divides each row of the switching shares by the alternatives' magnetism values. This produces a table in which higher values indicate greater switching (substitution) than magnetism alone would predict. That is, this table reveals the switching that cannot be explained by magnetism. For example, reading down the second-last column of the table, we can see that, ignoring magnetism, Thai is most likely to be substituted with Indian, followed by Chinese.
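In code, indexing is an elementwise division of each row by the magnetism vector, so that each cell is scaled against its destination column's magnetism (values shown are illustrative):

```python
import numpy as np

# Illustrative switching shares (diagonal removed, rows re-based to 100%).
switching_shares = np.array([
    [0.00, 0.40, 0.35, 0.25],
    [0.45, 0.00, 0.35, 0.20],
    [0.40, 0.35, 0.00, 0.25],
    [0.30, 0.45, 0.25, 0.00],
])

# Magnetism: average of each column's off-diagonal entries.
n = switching_shares.shape[0]
magnetism = switching_shares.sum(axis=0) / (n - 1)

# Index each cell against the destination's magnetism. Values above 1 mean
# more switching into that alternative than its overall pull would predict.
indexed = switching_shares / magnetism
```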

Step 5: Using multidimensional scaling (MDS) to summarize the indexed switching shares

MDS is a useful tool for summarizing tables like the one above. To use MDS, we need to convert the table into a distance matrix, where bigger numbers indicate that alternatives are further apart. There are two aspects to doing this:

  • We need to make the table symmetrical around the main diagonal. For example, the Indian-Thai switching is 3.13 and the Thai-Indian switching is 3.87; if we replace each of these values with their mean of 3.5, we create a symmetrical table. In the code below I’ve used the geometric mean to take into account that the underlying data are ratios, although the difference is trivial. In the case of Thai-Indian, the geometric mean is 3.48.
  • We need to take the inverse of the values, so that more similar alternatives have smaller values (i.e., are closer together).
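These two steps, plus a metric MDS, can be sketched in Python. The indexed values below are made up, and the MDS here is the classical (Torgerson) eigendecomposition approach in plain NumPy, standing in for Displayr's MDS routine:

```python
import numpy as np

# Illustrative indexed switching shares (zero diagonal; made-up values).
indexed = np.array([
    [0.0, 1.3, 1.5, 0.6],
    [1.2, 0.0, 0.9, 1.4],
    [1.6, 0.8, 0.0, 0.7],
    [0.5, 1.5, 0.6, 0.0],
])

# Symmetrize with the geometric mean of each (i, j) / (j, i) pair, then
# invert the off-diagonal values so similar alternatives get small distances.
symmetric = np.sqrt(indexed * indexed.T)
distance_matrix = np.zeros_like(symmetric)
mask = ~np.eye(len(symmetric), dtype=bool)
distance_matrix[mask] = 1.0 / symmetric[mask]

# Classical (metric) MDS: double-center the squared distances and take the
# top two eigenvectors as 2-D coordinates.
d2 = distance_matrix ** 2
n = len(d2)
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ d2 @ J
eigvals, eigvecs = np.linalg.eigh(B)
order = np.argsort(eigvals)[::-1][:2]            # two largest eigenvalues
coords = eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0))
```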

The plot below shows the results of metric MDS applied to the table above (in Displayr: Insert > Dimension Reduction > Multidimensional Scaling, selecting the distance.matrix created above as the Distance matrix). As is often the case with MDS, quite a lot of information is lost because not all of it can be shown in the plot (the data requires six dimensions to be described accurately, but MDS plots only two). For example, the table above shows us that Thai and Indian are very strong substitutes, so we might anticipate that they would sit on top of each other. However, Thai is also reasonably close to Chinese and Mexican, whereas Indian is a long way from each, so the resulting plot is a compromise between such effects. The biggest compromise to call out relates to Hamburgers, which are very similar to Pizza, Chicken, and Mexican, but not to Chinese, whereas each of Pizza, Chicken, and Mexican is close to Chinese, which again forces Hamburgers into a compromise position. This emphasizes that while the plot is a useful summary of overall patterns, it is not the whole story.

Step 6: Creating the substitution map

In the final step, I’ve created a bubble chart using the coordinates identified by the MDS, with the size of each bubble proportional to the alternative’s magnetism.


To create the substitution map in Displayr, first create a table that contains the coordinates from the MDS and the magnetism values. Then use Insert > Visualization > Scatterplot and select CHART > APPEARANCE > Show labels > On chart. The code for creating the table is shown below.
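The original R code for building this table is not shown in this text. As a Python sketch with made-up coordinates and magnetism values, the table simply pairs each alternative's MDS coordinates with its magnetism, which the scatterplot uses as the bubble size:

```python
import numpy as np

# Illustrative MDS coordinates and magnetism values (made-up numbers).
labels = ["Pizza", "Chinese", "Mexican", "Thai"]
coords = np.array([
    [-0.8,  0.2],
    [ 0.5,  0.6],
    [ 0.3, -0.5],
    [ 0.9,  0.1],
])
magnetism = np.array([0.35, 0.30, 0.20, 0.15])

# One row per alternative: x coordinate, y coordinate, bubble size.
table = np.column_stack([coords, magnetism])
```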

About Tim Bock

Tim Bock is the founder of Displayr. Tim is a data scientist, who has consulted, published academic papers, and won awards, for problems/techniques as diverse as neural networks, mixture models, data fusion, market segmentation, IPO pricing, small sample research, and data visualization. He has conducted data science projects for numerous companies, including Pfizer, Coca Cola, ACNielsen, KFC, Weight Watchers, Unilever, and Nestle. He is also the founder of Q, a data science product designed for survey research, which is used by all the world’s seven largest market research consultancies. He studied econometrics, maths, and marketing, and has a University Medal and PhD from the University of New South Wales (Australia’s leading research university), where he was an adjunct member of staff for 15 years.