The conjoint webinar series
We are doing a deep dive into conjoint over four webinars. In the last webinar we looked into calculating utilities and creating simulators. Today we find the story in the data.
Average utilities - among...
In the previous webinar, we focused on a case study looking at job choice.
Rather than look at all the data, I'm going to focus on the data from a segment of 454 people who are able to do their work from home.
I'm doing this because this segment's data is a bit noisy, and that will allow me to make a few hopefully interesting points.
In the previous webinar we talked about how it's often useful to scale utilities so that each attribute's least appealing level has a utility of 0. You can see, though, that this hasn't happened here.
E.g., current salary has a utility of 5.
I will explain why a bit later.
Average utilities as a chart
It's much easier to read as a chart. The traditional way of showing utilities is via line charts like this.
Average utilities scaled to have a minimum of 0
It's easier for clients if we rescale it so each attribute has a minimum of 0.
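If you want to see the rescaling spelled out, here is a minimal sketch. The level names and numbers are made up for illustration (only the idea of subtracting each attribute's lowest utility comes from the webinar), and `rescale_to_zero_min` is a hypothetical helper name.

```python
def rescale_to_zero_min(level_utilities):
    """Shift one attribute's utilities so its least appealing level is 0."""
    lowest = min(level_utilities.values())
    return {level: u - lowest for level, u in level_utilities.items()}


# Hypothetical average utilities for a single attribute (illustrative only).
salary = {"Current salary": 5.0, "+5%": 10.0, "+10%": 35.0}

rescaled = rescale_to_zero_min(salary)
print(rescaled)
```

The differences between levels are unchanged by the shift, which is why the chart tells the same story but is easier to read.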
There's a really great insight here. A 10% pay rise has 6 times the utility of a 5% pay rise. So, all you managers out there, fight for 10% to combat the inflation this year!
The next attribute looks at an employer's plans to become carbon neutral.
Here we see some inconsistencies. For example, on average, people prefer 30 years to 20 years for carbon neutrality.
Surveys always contain errors. When an ordered attribute is relatively unimportant - that is, when the difference in utility between the most and least preferred level is small - the errors become more obvious.
I was really surprised by the importance of software. If you are making your team use poor software, it is as painful to them as losing 13% of their salary!
Looking at location, the more remote, the happier people are. But, it's a diminishing effect.
Column and bar charts are also commonly used for viewing utilities.
Here's an example for 2 ounce chocolate bars.
A practical problem with showing utilities is that the scale is arbitrary. As discussed in the previous webinar, we can rescale in various ways. But, ultimately, what do the numbers mean?
One way of making the scaling less subjective is to report in terms of willingness to pay. This is also known as dollar-metric scaling.
The price range we researched runs from $0.99 to $2.49, which is a difference of $1.50.
From this chart, we can see that a utility of 1.48 equates to a saving of $1.50.
So, it follows that our scale is that 1 unit of utility = $1.01.
We can thus rescale all the utilities in terms of dollars.
For example, we can say that the utility of standard chocolate versus sugar free is worth $1.46.
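The arithmetic behind the dollar-metric scale can be sketched like this. The price range and the 1.48 utility difference are the figures quoted above; the function name `to_dollars` is just a hypothetical label for the conversion.

```python
# Figures quoted in the talk: prices tested and the utility gap between them.
price_range = 2.49 - 0.99   # dollars
utility_range = 1.48        # utility difference read from the chart

dollars_per_utility = price_range / utility_range


def to_dollars(utility_difference):
    """Convert a difference in utility into its dollar equivalent."""
    return utility_difference * dollars_per_utility


print(round(dollars_per_utility, 2))
print(round(to_dollars(1.48), 2))
```

Any other difference in utility is converted the same way, by multiplying it by the dollars-per-utility rate.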
This is sometimes used as a way of quantifying brand equity. In this case, the analysis would say that Hershey's brand equity relative to Lindt is 1.12 cents per bar.
It's a sexy thing to do as clients feel they understand it.
But, in my experience, they usually misunderstand it in the worst possible way, thinking that the dollar values can be used to determine pricing.
Why can't they?
Look at the sugar attribute. We are showing that people will pay a premium for standard chocolate over sugar free. If we were to believe that these values reflected the prices we should charge, it would mean that sugar free chocolate should be $1.46 cheaper than normal chocolate.
But, if you have ever bought sugar free chocolate, you will know the opposite is true. It costs much more.
In practice, when we price a product, we do so for a segment of people, but this analysis implicitly ignores this.
A second problem is more subtle. The technical meaning of the $1.46 willingness to pay for standard chocolate is that this is the average you would receive if you charged each person a different price, where that price was the highest price you could possibly charge them. In the real world there's no way to do this, as we don't know what people will pay and there are competitors.
So, while appealing, I recommend you don't do it.
We talked about how to create simulators last webinar.
Now for the fun stuff!
Job choice simulator
Simulators are great when you know what scenarios are important.
Currently, lots of companies are wanting employees to return to the office. Let's see what the preference is.
Job offer 1
- Work location
- Work in the office every day
Ah, so if we're competing against companies that allow fully remote work, we're going to lose 2/3 of potential candidates.
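For anyone curious what a simulator does in the background, here is a rough sketch. It assumes a logit share-of-preference rule averaged over respondents, which is a common choice but not necessarily the rule used by the software in this demo. The respondent utilities are invented, and `preference_shares` is a hypothetical helper name.

```python
import math


def preference_shares(respondent_utilities):
    """Average logit share of each alternative across respondents.

    respondent_utilities: one list per respondent, holding the total
    utility of each alternative in the scenario.
    """
    n_alternatives = len(respondent_utilities[0])
    totals = [0.0] * n_alternatives
    for utils in respondent_utilities:
        top = max(utils)  # subtract the max for numerical stability
        exps = [math.exp(u - top) for u in utils]
        denom = sum(exps)
        for i, e in enumerate(exps):
            totals[i] += e / denom
    return [t / len(respondent_utilities) for t in totals]


# Invented total utilities: [office every day, fully remote] per respondent.
respondents = [
    [1.0, 1.0],
    [0.0, 2.0],
    [0.5, 1.5],
]
shares = preference_shares(respondents)
print([round(s, 3) for s in shares])
```

Changing a level in a job offer changes that alternative's total utility for each respondent, and the shares are simply recomputed.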
What can we do to rectify this?
What about a pay rise?
Job offer 1
A 5% pay rise won't do much. Remember, we learned before that we need to do 10%.
OK, so if we want people to return to work, we need to pay them an extra 15% on average to make up for the extra hassle.
Rather than manually play around, we can use optimizers to calculate the share.
Edit mode > Hierarchical Bayes...
We create it in much the same way as a simulator.
Job offer 1
- Work location
- On: Fully remote
- On: Work in office every day
Rather than manually go through all the options, an optimizer just does it automatically.
Let's look at all the salary options
Alternative 1: Salary > Select all
So, these are the same numbers as before, but we got there a lot faster.
A demand curve shows how demand relates to price.
We can use the simulator and run different price scenarios.
Or, we can use the optimizer like we just did.
Demand curve for working from the office
Note that this is a bit more useful than just running the simulations, because we can interpolate in between points.
For example, it looks like people become indifferent - which means half will choose each option - between working from the office with a 14.5% pay rise and working from home.
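Reading a value between two simulated points is just linear interpolation. Here is a small sketch; the shares below are invented so that the crossing lands near the figure read off the chart, and `crossing_point` is a hypothetical helper name.

```python
def crossing_point(xs, shares, target=50.0):
    """Linearly interpolate the x value at which the share reaches target.

    Assumes shares rise as x increases. Returns None if the target is
    never reached within the simulated points.
    """
    points = list(zip(xs, shares))
    for (x0, s0), (x1, s1) in zip(points, points[1:]):
        if s0 <= target <= s1 and s1 != s0:
            return x0 + (target - s0) * (x1 - x0) / (s1 - s0)
    return None


# Invented preference shares (%) for the office job at each pay rise (%).
pay_rise = [0, 5, 10, 15, 20]
office_share = [33.0, 35.0, 41.0, 51.0, 60.0]

indifference = crossing_point(pay_rise, office_share)
print(indifference)
```

Straight-line interpolation is itself an assumption. The real curve between two tested levels could bend, so treat in-between values as approximate.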
Demand curve for low sugar chocolate
Here's a demand curve for sugar free chocolates.
The horizontal axis shows price. We can see that at $0, only about 32% of people are predicted to want sugar free chocolate.
This makes sense. Most people prefer chocolate with sugar.
But, from this chart we can see that around 26% of the market would pay a 60c premium.
And, 20% a $1 premium.
Charts like this one and the simulator are the magic of conjoint.
They give us very clear answers to business questions. We can get a very clear understanding of how people trade off between price and product features by asking relatively easy to understand questions, and doing a lot of math in the background.
Here the data tells us that sugar free is not a mass market proposition, but it is a feature we can charge a very large price premium for.
What about if we optimize both carbon neutrality and salary?
Alternative 1: Carbon neutrality > All
Note that there are 30 rows here. Let's sort them.
So, the optimal combination of salary and carbon neutrality as compensation for working from the office is apparently to offer a 20% higher salary and carbon neutrality in 5 years.
Indifference curves add another dimension to a demand curve.
This first point shows the same result we saw on the demand curve. We get a preference of 50% when we have 14.5% pay rise, but no carbon neutrality.
This line shows other combinations of salary and carbon neutrality that have this same 50% preference share. We can see that if we are already carbon neutral, we only have to offer 11%.
Some of you might be thinking that we are even better off if we only promise 5 years carbon neutrality, but this is likely just research error.
Select all for two remaining attributes
But, what's the optimal combination of everything?
We've now gone through 360 combinations. Our best option is apparently:
- 20% pay rise
- 5 year carbon neutrality
- Best software
- Fully remote
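Conceptually, the optimizer is just trying every combination of levels. Here is a sketch of that idea. All level names and utilities are invented for illustration. The level counts (5, 6, 3 and 4) are an assumption chosen so that the total matches the 360 combinations mentioned above. Ranking by summed average utility is a simplification of what an optimizer based on simulated shares would do.

```python
from itertools import product

# Invented levels and average utilities, for illustration only.
utilities = {
    "salary": {"current": 0, "+5%": 5, "+10%": 30, "+15%": 40, "+20%": 50},
    "carbon": {"no plan": 0, "30 years": 4, "20 years": 2,
               "10 years": 6, "5 years": 9, "already neutral": 8},
    "software": {"poor": 0, "ok": 20, "best": 35},
    "location": {"office every day": 0, "mostly office": 15,
                 "mostly remote": 25, "fully remote": 30},
}

attributes = list(utilities)
# Every possible job offer: one level chosen from each attribute.
combos = list(product(*(utilities[a] for a in attributes)))


def total_utility(combo):
    """Sum the utilities of the chosen level of each attribute."""
    return sum(utilities[a][level] for a, level in zip(attributes, combo))


best = max(combos, key=total_utility)
print(len(combos), best)
```

Note how noise in the carbon attribute carries straight through to the "best" offer, so the top-ranked combination should be read with the same caution as the utilities themselves.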
On the left we have our average utilities.
Remember our little conundrum. In the previous webinar, we scaled each person's utility to have a minimum value of 0 for the lowest level of utility. But, our average of these for the lowest salary level is 5, rather than 0.
Due to noise, not everybody has current salary as their lowest level. Some respondents have 20% as the least appealing level. Perhaps they are signalling that they won't take lots of money at the expense of the environment. Perhaps it's just noise. We can't tell.
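A tiny made-up example shows how this happens. Each respondent below is scaled to a minimum of 0, but one of them has the ordering reversed, so the average for the first level is no longer 0. The numbers and the name `scale_min_zero` are hypothetical.

```python
def scale_min_zero(levels):
    """Shift one respondent's utilities so their own lowest level is 0."""
    low = min(levels)
    return [u - low for u in levels]


# Invented utilities for four ordered levels of one attribute.
# The third respondent has the ordering reversed (noise, or a real view).
raw = [
    [0.0, 1.0, 2.0, 4.0],
    [1.0, 2.0, 3.0, 5.0],
    [3.0, 2.0, 1.0, 0.0],
]

scaled = [scale_min_zero(r) for r in raw]
averages = [sum(col) / len(col) for col in zip(*scaled)]
print(averages)
```

Every respondent has a 0 somewhere, but not all in the same place, so no level is guaranteed to average 0.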
Notice we have bumps at 0 and 100. This is because we scale the data between 0 and 100.
The bumps are most pronounced for the lowest and highest levels of an attribute. For the in-between levels we have much wider distributions, which reflects variation between people.
Let's look at correlations
Correlations of utilities
Now I've correlated each person's utilities for the job choice data.
The dark blue diagonal line is showing that each attribute is perfectly correlated - dark blue is a correlation of 1 - with itself.
There are a few standard patterns we expect to see.
With attributes with a natural ordering, we will see stronger correlations between adjacent levels.
And, negative correlations between the more distant.
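Here is a sketch of how a correlation matrix like this can be computed from respondent-level utilities. The utilities are invented, the level names are only illustrative, and `pearson` is a plain Pearson correlation written out by hand rather than whatever the software uses internally.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Invented utilities: one value per respondent for each level.
utilities = {
    "Salary +20%": [90.0, 70.0, 20.0, 10.0],
    "Salary +15%": [80.0, 65.0, 30.0, 20.0],
    "Current salary": [0.0, 10.0, 60.0, 80.0],
}

levels = list(utilities)
matrix = [[pearson(utilities[a], utilities[b]) for b in levels] for a in levels]
for name, row in zip(levels, matrix):
    print(name, [round(r, 2) for r in row])
```

In this toy data the adjacent levels move together and the distant ones move in opposite directions, which is the pattern described above.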
Point to Salary +20% in first column
For example, here we see that the people who really want a salary increase of 20% are the least likely to have a high utility for current salary.
Point to carbon neutral segmentation.
Here's our first really interesting pattern. There's a high correlation between each of our first three carbon neutral levels. The people that gave a low utility for no plan also gave low utilities for 30 years and 20 years.
And, here's another interesting result.
The three levels which involve some time in the office are highly correlated.
The one that stands out is fully remote. Note that there's even a negative correlation between fully remote and primarily remote.
Sometimes you have studies where your attributes have lots of levels.
Utilities tell you which ones are more preferred. But, a separate question is which ones are substitutes.
Substitution maps for fast food
Looking at the fast food market, where one attribute was food type, we can see here that, say, chicken meals and pizza are closer substitutes than chicken and Thai.
It's a pretty exotic topic, so I will refer you to the post if you want to create them yourself.
Another way to explore utilities for levels of an attribute is via cluster analysis.
Look at salary levels
It's sometimes useful to summarize all the utilities into a smaller set of numbers, which are referred to as importance scores.
Importance is the difference
Here's the utilities plot.
Importance is usually defined as the difference between the most appealing and least appealing level of each attribute.
So, you would say here that price is the most important attribute, followed by sugar level, cocoa strength and so on.
But, there's a bit of a trap here. And, it's a big trap, which means that I stopped reporting importance many years ago.
Compare price and brand. We've said that Price is more important than brand.
But, this importance is just determined by which attribute levels we had tested. If we'd only tested $1.99 and $2.49, we would instead be saying it's the other way around and brand is much more important than price.
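To make the trap concrete, here is a small sketch. Only the 1.48 utility range for price comes from the earlier chart. The in-between price utilities, the brand names and the brand utilities are invented, and `importance` is a hypothetical helper name.

```python
def importance(level_utilities):
    """Importance as the range: most appealing minus least appealing level."""
    values = list(level_utilities.values())
    return max(values) - min(values)


# Illustrative utilities. Only the 1.48 range for price is from the talk.
price = {"$0.99": 1.48, "$1.49": 1.10, "$1.99": 0.40, "$2.49": 0.00}
brand = {"Brand A": 0.00, "Brand B": 0.60, "Brand C": 1.00}

# The same study, had only the two highest prices been tested.
narrow_price = {level: price[level] for level in ("$1.99", "$2.49")}

print(importance(price), importance(brand), importance(narrow_price))
```

Nothing about people's preferences differs between the two versions. Only the set of levels tested changes, and that alone flips the ranking.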
This distinction is very important. But, in my experience, when you talk about importance it just gets lost, and people misunderstand the results, drawing lots of conclusions that aren't in the data.
Rather than defining importance as based on the full range of levels tested, I find it's more useful to define it based on what's strategically relevant.
Let's say our client is Godiva, and they are selling a 2 ounce bar with
- Standard sugar
- Dark chocolate
- No nuts
- Made in Belgium
- No fair trade offer
They could change from Dark to Milk, but it's not that important. Its importance is now defined as 9.
Similarly, the importance of almonds is only 16.
Segmentation: crosstabs and clustering
Understanding segmentation is often very useful.
There are two basic strategies.
Crosstabs and clustering
Crosstab via importance