Creating Online Conjoint Analysis Choice Simulators Using Displayr
Displayr creates an online choice simulator with the click of a few buttons. In this post I describe how to create a simulator, customize both its appearance and its calculations, and provide access to the simulator for others.
Creating the simulator
- Create a choice model of the conjoint using hierarchical Bayes (HB) or latent class analysis in Displayr (Insert > More > Choice Modeling). You can also do this in Q and upload the QPack to Displayr.
- Click on the model and press Insert > More > Choice Modeling > Simulator and choose the number of alternatives that you wish to have in the simulator.
A new page will then appear beneath the page that contains your model, and this page will contain your simulator. For example, one created for a study on the fast food market looks like this:
Customizing the appearance of the simulator
You can customize any component of the simulator by clicking on it and modifying it. For example, we’ve restyled the simulator above into the more attractive simulator below.
Customizing the calculations of the simulator
If you scroll down, you will find a small table below the simulator showing the simulated shares. If you click on this table, you will see the underlying code, which has a straightforward structure.
The code for this simulator is a bit long, as it has five alternatives and 14 attributes. Consequently, I will illustrate the various ways you can customize calculations using a simpler study of the US chocolate market.
Predictions for every respondent
The predict method for choice models requires two inputs. One is a model (in the example below, choice.model.7). The other is a scenario, which indicates the names and attribute levels of the alternatives being simulated. If the study includes attributes that are left out of the scenario, the predictions are made under the assumption of parity on those attributes.
Adding a ‘None of these’ to the predictions
Where you have a ‘None of these’ option in the data, and want to simulate its choices, you can do this by adding an extra alternative to the list of alternatives in the scenario (e.g., "Not buy" = c("Alternative" = "4 (None of these)")).
Preference shares
We can compute preference share, which is sometimes interpreted as market share, by computing the average of the respondents’ preferences:
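To make the arithmetic concrete, here is a minimal sketch in plain Python rather than Displayr's own R code. It assumes each respondent's preferences are logit probabilities computed from their utilities, and the utilities themselves are made-up numbers (rows are respondents, columns are alternatives). The function names are hypothetical.

```python
import math

def logit_probabilities(utilities):
    """Turn one respondent's utilities (one per alternative) into choice probabilities."""
    highest = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - highest) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def preference_share(utility_rows):
    """Average the respondent-level probabilities for each alternative."""
    probabilities = [logit_probabilities(row) for row in utility_rows]
    return [sum(column) / len(probabilities) for column in zip(*probabilities)]

# Hypothetical utilities for three respondents and three alternatives.
utilities = [
    [1.0, 0.0, -1.0],
    [0.0, 0.0, 0.0],
    [-0.5, 2.0, 0.5],
]
shares = preference_share(utilities)
print([round(s, 3) for s in shares])
```

Because each respondent's probabilities sum to 1, the averaged shares also sum to 1.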
Filtered preference share
There are two steps to applying filters. The first is to modify the code so it looks at filters, as done below. The second is to apply a filter, which can either be done in Displayr in edit mode or by viewers of published dashboards in view mode.
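Conceptually, a filter just restricts the average to the respondents who pass it. The following is a rough Python sketch of that idea, not the Displayr code. The probabilities and the filter values are invented for illustration, and filtered_share is a hypothetical helper.

```python
def filtered_share(probabilities, keep):
    """Average choice probabilities over the respondents where keep is True."""
    rows = [p for p, k in zip(probabilities, keep) if k]
    if not rows:
        raise ValueError("the filter excludes every respondent")
    return [sum(column) / len(rows) for column in zip(*rows)]

# Hypothetical respondent-level probabilities for two alternatives.
probabilities = [
    [0.6, 0.4],
    [0.2, 0.8],
    [0.5, 0.5],
]
# Hypothetical filter: keep the first and third respondents.
keep = [True, False, True]

shares = filtered_share(probabilities, keep)
print([round(s, 3) for s in shares])
```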
Weighted preference share
You can apply sampling and/or volume weights by modifying the code as below, and applying a weight to the R Output.
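The idea behind a weighted share is a weighted average: each respondent's probabilities are multiplied by their weight, and the total is divided by the sum of the weights. Here is a small Python sketch under that assumption. The probabilities and weights are made up, and weighted_share is a hypothetical name, not a Displayr function.

```python
def weighted_share(probabilities, weights):
    """Weighted average of respondent-level choice probabilities."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    n_alternatives = len(probabilities[0])
    return [
        sum(w * p[j] for p, w in zip(probabilities, weights)) / total_weight
        for j in range(n_alternatives)
    ]

# Hypothetical respondent-level probabilities for two alternatives.
probabilities = [
    [0.6, 0.4],
    [0.2, 0.8],
    [0.5, 0.5],
]
# Hypothetical weights: the second respondent counts three times as much.
weights = [1.0, 3.0, 1.0]

shares = weighted_share(probabilities, weights)
print([round(s, 3) for s in shares])
```

With equal weights this reduces to the simple average used for unweighted preference share.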
Rules for computing preferences
By default, Displayr computes preference share using the utilities estimated for each respondent. This is the typical way that preference is simulated in choice simulators. However, it is perhaps not the best method. When preferences are simulated using respondent utilities, we are implicitly assuming that the utilities are estimated without error. This assumption is definitely not correct. As discussed in Performing Conjoint Analysis Calculations with HB Draws (Iterations), we can improve on our calculations by performing them using draws. To do this we need to:
- Modify our choice model so that it saves the draws: set Inputs > SIMULATION > Iterations saved per individual to, say, 100.
- Explicitly specify rule = "logit draw", as shown below.
There are two other rules that can be used: ‘first choice draw’ and ‘first choice respondent’, which assume that people have a 100% probability of choosing the alternative with the highest utility.
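The difference between the rules can be sketched in a few lines of Python. This is my own illustration of the general logic, not Displayr's implementation: a logit rule turns utilities into probabilities, a first choice rule gives 100% to the highest utility, and the "draw" variants apply the rule to each saved draw and then average. The draws are invented numbers for a single respondent, and resolving ties in favor of the first alternative is an assumption.

```python
import math

def logit_rule(utilities):
    """Logit probabilities for one set of utilities."""
    highest = max(utilities)
    exps = [math.exp(u - highest) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def first_choice_rule(utilities):
    """100% probability on the highest-utility alternative (first one wins ties)."""
    winner = utilities.index(max(utilities))
    return [1.0 if j == winner else 0.0 for j in range(len(utilities))]

def average_over_draws(draws, rule):
    """Apply a rule to each draw, then average the results across draws."""
    per_draw = [rule(draw) for draw in draws]
    return [sum(column) / len(per_draw) for column in zip(*per_draw)]

# Hypothetical draws of utilities for one respondent and two alternatives.
draws = [[1.0, 0.5], [0.2, 0.9], [0.8, 0.1]]

logit_draw = average_over_draws(draws, logit_rule)
first_choice_draw = average_over_draws(draws, first_choice_rule)
print([round(p, 3) for p in logit_draw])
print([round(p, 3) for p in first_choice_draw])
```

The respondent-level rules correspond to applying the same functions once to the respondent's single set of utilities instead of averaging across draws.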
Calibrating to market share
It is often necessary to adjust choice simulators to make their predictions more accurate. The brute force approach to calibrating to market share is described in “How to Calibrate a Conjoint Analysis to Market Share”. However, I’d encourage you to first try the approaches in the next two sections, as they are more principled.
Calibrating to distribution and awareness
A more nuanced approach is to factor in knowledge at the sub-group level. For example, consider the situation where we know that Hershey will have 100% distribution, Lindt will have 50% distribution, and Godiva will have 10% distribution, and we also believe that the distributions are not correlated (i.e., we believe that Godiva is no more likely to be in a store if Lindt is in the store than if Lindt is not in the store).
The first step is to compute a new set of utilities for the respondent that are consistent with these distribution figures. I’ve illustrated this in the calculation below. This is a little tricky so I’ll walk you through the details:
- I start by generating a matrix (table), called d, with 403 rows, one for each respondent, and three columns, one for each brand in my simulation. I start with every value in this matrix being -Infinity.
- Then, I set the Hershey column to be all 0s, indicating that we have complete distribution (i.e., 0 indicates distribution for a respondent).
- Line 5 randomly assigns approximately half of the respondents a score of 0 for Lindt.
- Line 6 randomly assigns approximately 10% of respondents a score of 0 for Godiva (not shown).
- Line 1 ensures that each time the calculation is run, we get the same randomly generated data. Without this line, every time our calculations updated we would get a different set of results. While this would be good in that it would give us an understanding of the sensitivity of our conclusions to the random number generation, most people find it a bit unsettling if the results always change.
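For readers who want to see the steps above in code form, here is a rough Python equivalent. It is a sketch of the same idea rather than the calculation from the post: it assumes independent distribution across brands, uses the 403 respondents and the 100%, 50% and 10% distribution figures from the example, and picks an arbitrary seed. The function name and the seed value are my own.

```python
import random

NOT_AVAILABLE = float("-inf")  # the brand is not in distribution for this respondent

def build_distribution_offsets(n_respondents, distribution_rates, seed):
    """Return a matrix of offsets: 0 where a brand is available, -infinity otherwise."""
    rng = random.Random(seed)  # fixed seed so the result is the same on every run
    d = [[NOT_AVAILABLE] * len(distribution_rates) for _ in range(n_respondents)]
    for row in d:
        for j, rate in enumerate(distribution_rates):
            # rng.random() is in [0, 1), so a rate of 1.0 always gives availability.
            if rng.random() < rate:
                row[j] = 0.0
    return d

# Columns: Hershey (100%), Lindt (50%), Godiva (10%).
d = build_distribution_offsets(403, [1.0, 0.5, 0.1], seed=1223)

lindt_count = sum(1 for row in d if row[1] == 0.0)
godiva_count = sum(1 for row in d if row[2] == 0.0)
print(lindt_count, godiva_count)
```

Changing the seed gives a different random assignment, which is one way to check how sensitive the shares are to the random number generation.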
Then, we pass in these values via offset, as shown below. Note how the shares of the brands have changed massively.
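To see why an offset of -Infinity removes a brand from a respondent's choice set, consider this small Python sketch. It assumes the offset is simply added to the utility before the logit calculation, which is the usual meaning of an offset but is my assumption about the internals. The utilities are invented.

```python
import math

def probabilities_with_offset(utilities, offsets):
    """Logit probabilities after adding an offset to each alternative's utility."""
    adjusted = [u + o for u, o in zip(utilities, offsets)]
    highest = max(adjusted)
    if highest == float("-inf"):
        raise ValueError("no alternative is available to this respondent")
    # exp(-infinity) is 0, so unavailable alternatives get zero probability.
    exps = [math.exp(a - highest) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical utilities for Hershey, Lindt and Godiva for one respondent.
utilities = [0.2, 1.5, 0.9]
# Lindt is not in distribution for this respondent.
offsets = [0.0, float("-inf"), 0.0]

with_offset = probabilities_with_offset(utilities, offsets)
without_offset = probabilities_with_offset(utilities, [0.0, 0.0, 0.0])
print([round(p, 3) for p in without_offset])
print([round(p, 3) for p in with_offset])
```

The probability that would have gone to the unavailable brand is redistributed across the remaining brands in proportion to their own probabilities.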
We can also make more refined adjustments by:
- Producing correlated distributions via multivariate binary number generation (if you don’t know how to do this, best leave this one alone or hire an expert).
- Using other data. For example, if you know that people have not heard of a brand, you can manually set their values to -Infinity. Similarly, if you know they live in an area where the brand will not be distributed, you can also manually adjust the value passed to offset.
Tuning predictions using the scale parameter
One of the cool things about choice models is that if respondents have a high degree of noise in their data we can scale the utilities to reduce the effect of this noise. Or, if we believe that in the real world there is more random noise, we can also scale the utilities to reflect this.
In the first example, I use a scale parameter of 1. This has no effect at all, as multiplying a number by 1 does not change it. Consequently, the share predictions are the same as those provided in the Preference shares section.
In this next example I have used a scale parameter of 0.5, which has the effect of “flattening” the preference shares (i.e., making them more equal). While here I am just showing the effect of manually changing the scale parameter, the trick in a real-world application is to either:
- Choose an overall scale parameter that makes the data consistent with market share.
- Work out a separate scale parameter for each respondent by maximizing the accuracy of predictions at the respondent level (e.g., of their last 10 purchases).
You can automate this by modifying the code described in How to Calibrate a Conjoint Analysis to Market Share. Please note that scale is evaluated after offset.
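As an illustration of both the flattening effect and the first option above, here is a hedged Python sketch. It multiplies the utilities by the scale parameter before the logit calculation and then uses a simple grid search to pick the candidate scale whose shares are closest to a target. The grid search is my own stand-in, not the code from the calibration post, and the utilities, the candidate scales and the target are all made up. Offsets are left out to keep the example short.

```python
import math

def shares_at_scale(utility_rows, scale):
    """Preference shares after multiplying every utility by the scale parameter."""
    per_respondent = []
    for row in utility_rows:
        scaled = [scale * u for u in row]
        highest = max(scaled)
        exps = [math.exp(s - highest) for s in scaled]
        total = sum(exps)
        per_respondent.append([e / total for e in exps])
    return [sum(column) / len(per_respondent) for column in zip(*per_respondent)]

def calibrate_scale(utility_rows, target_shares, candidates):
    """Pick the candidate scale with the smallest squared error against the target."""
    def squared_error(scale):
        shares = shares_at_scale(utility_rows, scale)
        return sum((s - t) ** 2 for s, t in zip(shares, target_shares))
    return min(candidates, key=squared_error)

# Hypothetical utilities: three respondents, three alternatives.
utilities = [[2.0, 1.0, 0.0], [1.5, 0.5, 0.2], [3.0, 1.0, -1.0]]

original = shares_at_scale(utilities, 1.0)   # a scale of 1 changes nothing
flattened = shares_at_scale(utilities, 0.5)  # a scale of 0.5 makes shares more equal
print([round(s, 3) for s in original])
print([round(s, 3) for s in flattened])

# Hypothetical target shares, then a search over a small grid of scales.
target = [0.5, 0.3, 0.2]
best_scale = calibrate_scale(utilities, target, [0.25, 0.5, 1.0, 2.0])
print(best_scale)
```

A finer grid or a one-dimensional optimizer would give a more precise scale. The same search could be run per respondent if you have respondent-level purchase data.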
Providing access to the simulator for others
Once you have set up a simulator in Displayr, you can publish it as a web page (Export > Private Web page), choosing either to allow access by anybody with the URL or to set up password access.
About Tim Bock
Tim Bock is the founder of Displayr. Tim is a data scientist, who has consulted, published academic papers, and won awards, for problems/techniques as diverse as neural networks, mixture models, data fusion, market segmentation, IPO pricing, small sample research, and data visualization. He has conducted data science projects for numerous companies, including Pfizer, Coca Cola, ACNielsen, KFC, Weight Watchers, Unilever, and Nestle. He is also the founder of Q www.qresearchsoftware.com, a data science product designed for survey research, which is used by all the world’s seven largest market research consultancies. He studied econometrics, maths, and marketing, and has a University Medal and PhD from the University of New South Wales (Australia’s leading research university), where he was an adjunct member of staff for 15 years.