So what is MaxDiff?
MaxDiff is a tool used to understand how people prioritize things.
In the case study we'll discuss today, we got people to prioritize different aspects of cell phone plans. We wanted to know what was most important to them when choosing a cell phone company or provider. Other common types of MaxDiff alternatives are menu items, advertising claims, and promotional offers.
What you can see on the screen here are shortened descriptions of the alternatives we tested in the case study.
MaxDiff has its own special type of survey question. People are shown a subset of the alternatives and asked to choose the one that's most important or preferred and the one that's least important or preferred.
I'll give you a moment to review the example shown here.
People are typically asked from six to twelve MaxDiff questions where the subset of alternatives shown varies from question to question. By looking at how people's choices change based on the subset of alternatives shown, we can work out what's important to them.
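To make that idea concrete, here's a tiny Python sketch of the simplest way to score the answers: count how often each alternative is picked as best minus how often it's picked as worst. It's just a counting illustration with made-up answers and labels, not the Hierarchical Bayes model we'll estimate later.

```python
# A minimal counting sketch (not the Hierarchical Bayes model used later):
# score each alternative by how often it is chosen as "best" minus how often
# it is chosen as "worst" across the questions one respondent answers.
from collections import Counter

# Hypothetical answers: each question shows a subset of alternatives and
# records which one was chosen as most and least important.
answers = [
    {"shown": ["Price", "Coverage", "Streaming speed", "Mobile hotspot", "Voice quality"],
     "best": "Price", "worst": "Mobile hotspot"},
    {"shown": ["Price", "Data allowance", "Premium entertainment", "Coverage", "Customer service"],
     "best": "Coverage", "worst": "Premium entertainment"},
    {"shown": ["Streaming speed", "Voice quality", "Price", "Customer service", "Data allowance"],
     "best": "Price", "worst": "Customer service"},
]

scores = Counter()
for q in answers:
    scores[q["best"]] += 1   # chosen as most important
    scores[q["worst"]] -= 1  # chosen as least important

for alternative, score in scores.most_common():
    print(alternative, score)
```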
In this example output, we have two different segments. Both prioritize price as most important, but the talker segment's number two priority is coverage, whereas the viewers have a much higher preference for streaming speed and mobile hotspot.
So MaxDiff is a measurement tool designed to capture how people prioritize different things. There are four key steps in conducting a MaxDiff study, and I'm going to walk you through them.
Step one is creating the experimental design, that is, working out the specific MaxDiff questions to ask.
Step two is field work, that is collecting the data.
Step three is creating and running a model.
Typically, something called a Hierarchical Bayes model is used.
And, finally, step four is extracting and analyzing the utilities from the Hierarchical Bayes model to work out what patterns exist in the market.
We'll start with working out the experimental design. An experimental design is a table of numbers, just like the one shown here.
The first row of the experimental design tells us what goes in the first question. As we can see here, question one starts with alternative number one, which is voice quality. When we replace the alternative numbers with words, it gets a bit easier to read.
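If it helps to see that mapping in code, here's a small Python sketch that turns rows of alternative numbers into readable questions. The design rows and most of the labels are made up for illustration; only alternative one, voice quality, comes from the case study.

```python
# A small sketch of how an experimental design row maps to a question.
# The rows and most labels below are illustrative, not the actual design.
labels = {1: "Voice quality", 2: "Price", 3: "Coverage", 4: "Data allowance",
          5: "Streaming speed", 6: "Mobile hotspot", 7: "Customer service",
          8: "Premium entertainment", 9: "International roaming", 10: "Contract flexibility"}

design = [
    [1, 4, 7, 9, 10],   # question 1: which alternatives to show
    [2, 3, 5, 6, 8],    # question 2
    # ... one row per question
]

for q_number, row in enumerate(design, start=1):
    shown = ", ".join(labels[a] for a in row)
    print(f"Question {q_number}: {shown}")
```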
Here are instructions you can refer to later for both Displayr and Q.
We'll now create an experimental design from scratch, and you'll see it's not hard at all. As always, if you don't know where something is in Displayr, you can just search for it.
By default, the software creates a design with eight alternatives. However, we have ten alternatives in our cell phone case study.
Some people create designs to test one hundred or more alternatives, but the fewer alternatives, the better from a research quality perspective. I'll go ahead and change the number of alternatives from eight to ten.
You can see we get an orange warning saying something's amiss. If you read the MaxDiff ebook, you can learn all about the orange warnings. But the good news is you can play it just like a computer game, where the goal is to get rid of the orange warnings.
Basically, the way the game works is that the more alternatives you test, the more alternatives we need to show in each question or the more questions we need to ask, so that we collect enough data to produce a good model. Unless your alternatives are very wordy, you can usually have five alternatives per question, so I'll do that.
Note that a column for a fifth option has been added to the design, and we also got rid of the orange warning, but that doesn't mean we're finished.
At the moment, we have ten questions. The more questions, the more it costs to collect the data and the more bored respondents get. So let's see if we can get away with fewer questions to reduce data collection cost and respondent fatigue.
Okay. We get another orange warning. I'm actually gonna reduce the number of questions further. Let's see if we get lucky with six questions. Great. No orange warnings. I've done this before, and I can tell you that six questions is as low as we can go without orange warnings.
Now there's a whole lot of math you can do to figure out how many questions to ask, but trial and error, just like I've done here, works perfectly.
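If you want a rough sense of the arithmetic behind those warnings, here's a back-of-the-envelope check in Python. The rule of thumb in the comment is an assumption on my part, not the software's exact criterion.

```python
# A rough back-of-the-envelope check behind the "orange warning" game:
# how many times does each alternative get shown to a single respondent?
n_alternatives = 10
alts_per_question = 5
n_questions = 6

slots = n_questions * alts_per_question       # 30 slots per respondent
appearances = slots / n_alternatives          # each alternative shown ~3 times
print(f"Each alternative is shown about {appearances:.0f} times per respondent")

# A common rule of thumb (an assumption here, not the software's exact rule)
# is to aim for each alternative appearing at least around 3 times per person.
```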
If you only need to do segmentation, then your best bet is to have one version of questions for all respondents. Otherwise, it's a good idea to have multiple versions. Versions are sometimes called blocks. Ten versions are probably enough, but I'll do one hundred versions just to be safe. As you can see, the design has grown and now has six hundred rows.
So let's say we were going to interview three hundred people. We'd assign three people to each version where each version is a separate set of six questions or rows in this table, and the difference between them is which five alternatives appear in which questions.
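Here's a small Python sketch of that even spreading of respondents across versions. The respondent IDs are illustrative, and in practice your survey platform usually handles the rotation for you.

```python
# A sketch of spreading respondents evenly across design versions.
# Respondent IDs 1..300 and versions 1..100 are illustrative only.
from collections import Counter

n_respondents = 300
n_versions = 100

# Cycle through the versions so they are all used equally often.
assignment = {r: (r - 1) % n_versions + 1 for r in range(1, n_respondents + 1)}

per_version = Counter(assignment.values())
print(min(per_version.values()), max(per_version.values()))  # -> 3 3
```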
Lastly, I'll change repeats to ten. This usually does nothing, but it can improve things just a bit.
Step two is fieldwork or data collection. Our software will do the experimental design, modeling, and analysis for MaxDiff, but we don't do the fieldwork or data collection.
However, here are some data collection tips. I won't read these tips, but I'll give you a moment to scan through them. Again, we'll share the link to this document later this week.
Once we've collected our data, we need to get a data file. An SPSS or Triple S data file is best.
Using the experimental design that I just showed, we went ahead and collected three hundred interviews. I'll add the data file now.
The next step is to estimate a Hierarchical Bayes model.
Here are our instructions for Displayr and Q. I'll give you a moment to review.
First, I need to hook up the experimental design that I created to the model. We also need to hook up which version of the design respondents saw. It's mission critical that you have a variable in the data file that records which version each respondent saw.
For example, hovering over the version variable in the data file, we see that the first respondent saw version forty seven, while the second respondent saw version fifty nine.
Here are the six variables that store which alternatives people selected as most important in the six questions. And here are the six variables that store their least important selections for the six questions.
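In case it helps, here's a sketch of the kind of data-file layout the model expects: one variable recording the version each respondent saw, plus best and worst variables for each question. The column names are made up; your file may use different ones.

```python
# A sketch of the expected data-file layout (column names are illustrative).
import pandas as pd

data = pd.DataFrame({
    "respondent_id": [1, 2],
    "version":       [47, 59],          # which design version each respondent saw
    "best_q1":  [2, 5], "worst_q1": [8, 1],
    "best_q2":  [3, 2], "worst_q2": [6, 9],
    # ... best_q3..best_q6 and worst_q3..worst_q6 for the remaining questions
})
print(data)
```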
The model will take a few moments to compute, so let me show you how we do this in Q while we wait.
As shown in the instructions, we create the experimental design this way: you go to Create, Marketing, MaxDiff, and then Experimental Design. We'll use the exact same settings that we used in Displayr for the design, so ten alternatives, five alternatives per question, six questions, one hundred versions, and ten repeats. Here we go.
We also run the Hierarchical Bayes model in exactly the same way. Just go back to Create, Marketing, MaxDiff, and now select Hierarchical Bayes.
And just like in Displayr, you need to hook it up to the design. You need to point it to the variable that records each respondent's version and then select the six best or most important variables.
And, lastly, select the six worst or least important variables, and then you're off and running.
Alright. Let's jump back to Displayr, and you can see the results from the Hierarchical Bayes model.
Okay. We've got some orange warnings. They're telling us to run some more iterations. Be a bit careful here. It's only in recent years that it's become clear that a lot of the older software doesn't run for enough iterations. I'll set the number of iterations to one thousand.
The model takes a few minutes to run, so I went ahead and ran it beforehand. Let's take a look at the results.
The Mean column shows the average importance for each alternative. The highest mean is for price, which tells us that it's most important on average.
If we look at the histogram to the left, though, we can see that people vary quite a bit. Blue means above average, and red means below average. So in the case of price, we can say that most people consider it to be above average in importance. But let's contrast that with premium entertainment.
The average is low, and just about everybody has a score below zero, so it's unimportant to almost everybody.
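Here's a small Python sketch of the kind of summary those histograms reflect: for each alternative, the mean utility and the share of respondents whose utility is above zero. The utility values are made up for illustration.

```python
# For each alternative, summarise made-up respondent-level utilities:
# the mean, and the share of respondents above zero.
import numpy as np

utilities = {
    "Price":                 np.array([2.1, 1.8, 0.9, 2.5, -0.2]),
    "Premium entertainment": np.array([-1.5, -2.0, -0.8, -1.1, -1.9]),
}

for alternative, u in utilities.items():
    print(alternative,
          "mean:", round(u.mean(), 2),
          "share above zero:", f"{(u > 0).mean():.0%}")
```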
Now this output that we see here is too complicated for most clients, so we need to extract the utilities, which are the underlying measures of the importance of each alternative.
I'll go ahead and do that.
As you can see, the utilities were added to the data file as variables.
So how do we analyze the utilities? Here are the instructions for saving utilities, which I just did a few seconds ago. I'll give you a moment to review.
We can show the average utilities in a summary table like this one here, but this can lead to painful and unproductive discussions when clients ask what exactly the numbers mean. So we've developed a special chart just for this, which we call a bump chart. It's great for showing relativities.
Okay. So price is most important followed by coverage. Let's see if this differs by gender.
Nope. No differences by gender. Let's see if there are differences by age.
Comparing across age groups, we see that the most important alternatives, price and coverage, are consistent, but streaming speed, in particular, is less important to the oldest age group.
Some people love to use MaxDiff utilities for TURF analysis. If you're one of these people, this is how you do it. First, we convert the zero-centered utilities that we saved into ranks, computed within each case. We then convert the ranks into binary variables that flag each respondent's top two ranked alternatives.
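Here's a short Python sketch of those two conversions: utilities to within-respondent ranks, and then to binary top-two indicators. The column names and utility values are illustrative, not output from the case study.

```python
# Zero-centered utilities -> within-respondent ranks -> binary top-two flags.
# Values and column names are made up for illustration.
import pandas as pd

utilities = pd.DataFrame({
    "Price":           [2.1, 0.4, 1.9],
    "Coverage":        [1.2, 1.8, 0.3],
    "Streaming speed": [0.5, 2.2, 1.1],
    "Mobile hotspot":  [-0.8, 0.9, -0.5],
})

ranks = utilities.rank(axis=1, ascending=False)   # rank within each respondent (1 = best)
top2 = (ranks <= 2).astype(int)                   # 1 if the alternative is in that respondent's top two
print(top2)
```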
I'll now go ahead and start the TURF, and we just use the binary variables as the alternatives for the TURF.
Okay. So the best two-alternative portfolio is price and coverage, followed by price and streaming speed.
If you wanna see the top three alternative portfolio, just increase the portfolio size to three.
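And here's a sketch of the reach idea that TURF is built on: a respondent is reached if at least one alternative in the portfolio is in their top two. This brute-force search is just an illustration with made-up data, not how the software implements TURF; increasing the combination size mirrors increasing the portfolio size.

```python
# Brute-force TURF reach over binary top-two indicators (illustrative only).
from itertools import combinations
import pandas as pd

top2 = pd.DataFrame({           # 1 = alternative is in that respondent's top two
    "Price":           [1, 0, 1, 1],
    "Coverage":        [1, 1, 0, 0],
    "Streaming speed": [0, 1, 1, 0],
    "Mobile hotspot":  [0, 0, 0, 1],
})

def reach(portfolio):
    # Share of respondents with at least one portfolio alternative in their top two.
    return (top2[list(portfolio)].max(axis=1) == 1).mean()

for size in (2, 3):
    best = max(combinations(top2.columns, size), key=reach)
    print(size, best, f"{reach(best):.0%}")
```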
You can check out our TURF webinar and ebook for more detailed info about TURF and how to run it.
We can also use MaxDiff utilities as inputs for a segmentation, but there's a special segmentation method just for MaxDiff that's very easy to run. We just duplicate the Hierarchical Bayes model, quickly change the title, and then change its settings: instead of Hierarchical Bayes, we do a latent class analysis, and I'll set the number of classes to four.
There we go.
Once the latent class analysis MaxDiff model is done running, we just need to save each respondent's class or segment membership by clicking the class membership button right here. You can see the class membership variable has been added to the data file. And here are instructions for running a latent class analysis MaxDiff model.
Great. Now let's move on to your questions and answer as many of them as possible in the remaining time. Just give me a few seconds to switch gears and review the questions.
Okay. So someone asked if the recording will be available after the session. The answer is yes. Just keep an eye out for an email with the link to the recording and also a link to the document.
Okay. Alright. So someone asked if they need Displayr to analyze the data, or whether they can do it in Q. They have Q, but they don't have Displayr and don't plan to get it. The good news is you don't need Displayr. You can do everything I've shown you here in Q.
Okay. So Eric asked, what is the maximum and minimum number of attributes that you recommend?
Great question, Eric. There's no hard and fast rule, but you do need to be prudent and make sure you're only including attributes or alternatives that you absolutely need to test. If you're not sure whether to include an attribute, I would lean towards leaving it out. Again, you only want to test things that you absolutely need to test.
Like I said earlier, some people do test upwards of one hundred attributes in a MaxDiff, but the more attributes you have, the more you're asking of respondents, and you run the risk of respondent fatigue.
Alright. So Guy asked, is Hierarchical Bayes the best model to use? He said he saw a few different types of analysis in the list. Let me see. I just need to make it a little bigger.
Good question, Guy. The answer is yes. Hierarchical Bayes is the gold standard for estimating a MaxDiff model and MaxDiff utilities. But if you're looking to do segmentation, in both Displayr and Q you also have latent class analysis, which will run the MaxDiff model and perform your segmentation at the same time. But, again, to answer your question, Hierarchical Bayes is the best model to run.
Okay. So someone asked about anchored MaxDiff.
And they asked, alright, does Displayr handle anchored MaxDiff?
The answer is yes. You can do anchored MaxDiff in Displayr and in Q, but it is quite a bit more involved than running a standard MaxDiff in Displayr or Q.
Okay. So Joanne asked why one hundred versions in the experimental design with three hundred respondents?
Like I said, Joanne, I think you could get away with maybe ten versions or blocks in the design, but it can't hurt to have more, so we just bumped it up all the way to one hundred.
The key thing is when you have multiple versions in your design, you wanna make sure that each version is seen or evaluated by roughly an equal number of respondents.
And Joanne followed up: if you have a hundred versions and three hundred respondents, will each version only be seen by a few people? The answer is yes. If you have fewer versions, more people will see or evaluate each version.
Michael asked, is the number of iterations based on the number of respondents that take the survey?
Good question, Michael. The answer is no, not for a Hierarchical Bayes MaxDiff model. Modeling is trial and error. Displayr and Q start with one hundred iterations, and if you run the model with that many iterations and get the orange warning saying you need to run more, we advise doubling it. So try two hundred and see how it goes, and if two hundred isn't sufficient, double that again.
So Kevin asked, can you do conjoint studies in Displayr or in Q? The answer is yes. Just like for MaxDiff, you can use Displayr and Q to generate an experimental design for conjoint and estimate your utilities. And in Displayr, you can create a preference simulator.
Okay. Let's see. Just give me a second. Review a few more questions.
How to do TURF in Q? It's really straightforward. I can follow up with you with instructions for doing that.
So, Stephen asked, has the ebook been updated since the 2020 version? The answer is yes.
So Brendan had a great question about rescaling the MaxDiff utilities so that the values sum to one hundred, and he asked how to do this. Brendan, I'm happy to follow up with you on that. Great question.
Well, I think I've covered most of the questions, and I see that we're at the end of our time. So thank you, everyone, again for joining the webinar.
For those who asked questions that I didn't get to, just keep an eye out for an email; I'll follow up and answer your questions via email. I want to thank everyone again, and I hope you all have a great rest of your day. Take care.